207 Human Secreted Proteins



Field of the Invention 



This invention relates to newly identified polynucleotides and the polypeptides 
encoded by these polynucleotides, uses of such polynucleotides and polypeptides, and 
their production.



Background of the Invention

Unlike bacteria, which exist as a single compartment surrounded by a
membrane, human cells and other eucaryotes are subdivided by membranes into many
functionally distinct compartments. Each membrane-bounded compartment, or
organelle, contains different proteins essential for the function of the organelle. The cell
uses "sorting signals," which are amino acid motifs located within the protein, to target
proteins to particular cellular organelles.

One type of sorting signal, called a signal sequence, a signal peptide, or a leader
sequence, directs a class of proteins to an organelle called the endoplasmic reticulum
(ER). The ER separates the membrane-bounded proteins from all other types of
proteins. Once localized to the ER, both groups of proteins can be further directed to
another organelle called the Golgi apparatus. Here, the Golgi distributes the proteins to
vesicles, including secretory vesicles, the cell membrane, lysosomes, and the other
organelles.

Proteins targeted to the ER by a signal sequence can be released into the
extracellular space as a secreted protein. For example, vesicles containing secreted
proteins can fuse with the cell membrane and release their contents into the extracellular
space - a process called exocytosis. Exocytosis can occur constitutively or after receipt
of a triggering signal. In the latter case, the proteins are stored in secretory vesicles (or
secretory granules) until exocytosis is triggered. Similarly, proteins residing on the cell
membrane can also be secreted into the extracellular space by proteolytic cleavage of a
"linker" holding the protein to the membrane.

Despite the great progress made in recent years, only a small number of genes
encoding human secreted proteins have been identified. These secreted proteins include
the commercially valuable human insulin, interferon, Factor VIII, human growth
hormone, tissue plasminogen activator, and erythropoietin. Thus, in light of the
pervasive role of secreted proteins in human physiology, a need exists for identifying
and characterizing novel human secreted proteins and the genes that encode them. This
knowledge will allow one to detect, to treat, and to prevent medical disorders by using
secreted proteins or the genes that encode them.

Summary of the Invention

The present invention relates to novel polynucleotides and the encoded 
polypeptides. Moreover, the present invention relates to vectors, host cells, antibodies, 
and recombinant methods for producing the polypeptides and polynucleotides. Also
provided are diagnostic methods for detecting disorders related to the polypeptides, and 
therapeutic methods for treating such disorders. The invention further relates to 
screening methods for identifying binding partners of the polypeptides. 

Detailed Description

Definitions 

The following definitions are provided to facilitate understanding of certain 
terms used throughout this specification. 

In the present invention, "isolated" refers to material removed from its original
environment (e.g., the natural environment if it is naturally occurring), and thus is
altered "by the hand of man" from its natural state. For example, an isolated 
polynucleotide could be part of a vector or a composition of matter, or could be 
contained within a cell, and still be "isolated" because that vector, composition of 
matter, or particular cell is not the original environment of the polynucleotide. 

In the present invention, a "secreted" protein refers to those proteins capable of
being directed to the ER, secretory vesicles, or the extracellular space as a result of a
signal sequence, as well as those proteins released into the extracellular space without 
necessarily containing a signal sequence. If the secreted protein is released into the 
extracellular space, the secreted protein can undergo extracellular processing to produce
a "mature" protein. Release into the extracellular space can occur by many
mechanisms, including exocytosis and proteolytic cleavage. 

As used herein, a "polynucleotide" refers to a molecule having a nucleic acid
sequence contained in SEQ ID NO:X or the cDNA contained within the clone deposited
with the ATCC. For example, the polynucleotide can contain the nucleotide sequence
of the full length cDNA sequence, including the 5' and 3' untranslated sequences, the
coding region, with or without the signal sequence, the secreted protein coding region, 
as well as fragments, epitopes, domains, and variants of the nucleic acid sequence. 
Moreover, as used herein, a "polypeptide" refers to a molecule having the translated 
amino acid sequence generated from the polynucleotide as broadly defined. 

In the present invention, the full length sequence identified as SEQ ID NO:X
was often generated by overlapping sequences contained in multiple clones (contig
analysis). A representative clone containing all or most of the sequence for SEQ ID
NO:X was deposited with the American Type Culture Collection ("ATCC"). As 
shown in Table 1, each clone is identified by a cDNA Clone ID (Identifier) and the 
ATCC Deposit Number. The ATCC is located at 10801 University Boulevard, 
Manassas, Virginia 20110-2209, USA. The ATCC deposit was made pursuant to the
terms of the Budapest Treaty on the international recognition of the deposit of 
microorganisms for purposes of patent procedure. 

A "polynucleotide" of the present invention also includes those polynucleotides 
capable of hybridizing, under stringent hybridization conditions, to sequences contained 
in SEQ ID NO:X, the complement thereof, or the cDNA within the clone deposited with
the ATCC. "Stringent hybridization conditions" refers to an overnight incubation at 42°C
in a solution comprising 50% formamide, 5x SSC (750 mM NaCl, 75 mM sodium
citrate), 50 mM sodium phosphate (pH 7.6), 5x Denhardt's solution, 10% dextran
sulfate, and 20 µg/ml denatured, sheared salmon sperm DNA, followed by washing the
filters in 0.1x SSC at about 65°C.

Also contemplated are nucleic acid molecules that hybridize to the 
polynucleotides of the present invention at lower stringency hybridization conditions. 
Changes in the stringency of hybridization and signal detection are primarily 
accomplished through the manipulation of formamide concentration (lower percentages 
of formamide result in lowered stringency), salt conditions, or temperature. For
example, lower stringency conditions include an overnight incubation at 37°C in a
solution comprising 6X SSPE (20X SSPE = 3M NaCl; 0.2M NaH2PO4; 0.02M EDTA,
pH 7.4), 0.5% SDS, 30% formamide, 100 µg/ml salmon sperm blocking DNA,
followed by washes at 50°C with 1X SSPE, 0.1% SDS. In addition, to achieve even
lower stringency, washes performed following stringent hybridization can be done at
higher salt concentrations (e.g., 5X SSC).
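The buffer strengths quoted in the hybridization conditions above follow from simple dilution arithmetic. The following is a minimal sketch only; the function and dictionary names are illustrative assumptions and are not part of this specification. The 1X SSC composition is back-calculated from the stated 5x SSC values (750 mM NaCl, 75 mM sodium citrate), and the 1X SSPE composition from the stated 20X SSPE values.

```python
# Illustrative dilution arithmetic for the SSC and SSPE buffers named above.
# All names are hypothetical; concentrations are in mM.

# 1X SSC, derived from the stated 5x SSC = 750 mM NaCl, 75 mM sodium citrate.
SSC_1X = {"NaCl": 750 / 5, "sodium citrate": 75 / 5}

# 1X SSPE, derived from the stated 20X SSPE = 3 M NaCl, 0.2 M NaH2PO4, 0.02 M EDTA.
SSPE_1X = {"NaCl": 3000 / 20, "NaH2PO4": 200 / 20, "EDTA": 20 / 20}


def scale_buffer(one_x, strength):
    """Return component concentrations (mM) of a buffer at the given X strength."""
    # Rounding removes floating-point noise from the multiplication.
    return {name: round(conc * strength, 6) for name, conc in one_x.items()}


if __name__ == "__main__":
    print("0.1x SSC wash:", scale_buffer(SSC_1X, 0.1))
    print("6X SSPE hybridization:", scale_buffer(SSPE_1X, 6))
    print("1X SSPE wash:", scale_buffer(SSPE_1X, 1))
```

Comparing the scaled values makes the salt dependence concrete: the 0.1x SSC wash used under stringent conditions contains far less salt than a 5X SSC wash, which is consistent with the statement that washes at higher salt concentrations lower the stringency.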

Note that variations in the above conditions may be accomplished through the 
inclusion and/or substitution of alternate blocking reagents used to suppress 
background in hybridization experiments. Typical blocking reagents include
Denhardt's reagent, BLOTTO, heparin, denatured salmon sperm DNA, and
commercially available proprietary formulations. The inclusion of specific blocking
reagents may require modification of the hybridization conditions described above, due 
to problems with compatibility. 

Of course, a polynucleotide which hybridizes only to polyA+ sequences (such
as any 3' terminal polyA+ tract of a cDNA shown in the sequence listing), or to a
complementary stretch of T (or U) residues, would not be included in the definition of
"polynucleotide," since such a polynucleotide would hybridize to any nucleic acid
molecule containing a poly(A) stretch or the complement thereof (e.g., practically any
double-stranded cDNA clone).
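The exclusion above can be illustrated as a simple sequence check. The sketch below is an illustration under stated assumptions, not a method taken from this specification: the function name is hypothetical, and a sequence is flagged only when it consists entirely of A residues or entirely of T or U residues.

```python
def is_homopolymer_tail_probe(sequence):
    """Return True if the sequence is only poly(A) or only poly(T/U).

    Hypothetical helper: a probe of this kind would hybridize to practically
    any cDNA clone, so it carries no sequence-specific information.
    """
    bases = set(sequence.upper())
    if not bases:
        return False  # an empty string is not treated as a probe
    return bases <= {"A"} or bases <= {"T", "U"}
```

A sequence that contains a poly(A) tract together with other bases is not flagged, since it can still hybridize through its non-poly(A) portion.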

The polynucleotide of the present invention can be composed of any
polyribonucleotide or polydeoxyribonucleotide, which may be unmodified RNA or DNA
or modified RNA or DNA. For example, polynucleotides can be composed of single-
and double-stranded DNA, DNA that is a mixture of single- and double-stranded
regions, single- and double-stranded RNA, and RNA that is a mixture of single- and
double-stranded regions, hybrid molecules comprising DNA and RNA that may be
single-stranded or, more typically, double-stranded or a mixture of single- and
double-stranded regions. In addition, the polynucleotide can be composed of triple-stranded
regions comprising RNA or DNA or both RNA and DNA. A polynucleotide may also
contain one or more modified bases or DNA or RNA backbones modified for stability
or for other reasons. "Modified" bases include, for example, tritylated bases and
unusual bases such as inosine. A variety of modifications can be made to DNA and
RNA; thus, "polynucleotide" embraces chemically, enzymatically, or metabolically
modified forms.

The polypeptide of the present invention can be composed of amino acids joined
to each other by peptide bonds or modified peptide bonds, i.e., peptide isosteres, and
may contain amino acids other than the 20 gene-encoded amino acids. The
polypeptides may be modified by either natural processes, such as posttranslational
processing, or by chemical modification techniques which are well known in the art.
Such modifications are well described in basic texts and in more detailed monographs,
as well as in a voluminous research literature. Modifications can occur anywhere in a
polypeptide, including the peptide backbone, the amino acid side-chains and the amino
or carboxyl termini. It will be appreciated that the same type of modification may be
present in the same or varying degrees at several sites in a given polypeptide. Also, a
given polypeptide may contain many types of modifications. Polypeptides may be
branched, for example, as a result of ubiquitination, and they may be cyclic, with or
without branching. Cyclic, branched, and branched cyclic polypeptides may result
from posttranslational natural processes or may be made by synthetic methods.
Modifications include acetylation, acylation, ADP-ribosylation, amidation, covalent
attachment of flavin, covalent attachment of a heme moiety, covalent attachment of a
nucleotide or nucleotide derivative, covalent attachment of a lipid or lipid derivative,
covalent attachment of phosphatidylinositol, cross-linking, cyclization, disulfide bond
formation, demethylation, formation of covalent cross-links, formation of cysteine,
formation of pyroglutamate, formylation, gamma-carboxylation, glycosylation, GPI
anchor formation, hydroxylation, iodination, methylation, myristoylation, oxidation,
pegylation, proteolytic processing, phosphorylation, prenylation, racemization,
selenoylation, sulfation, transfer-RNA mediated addition of amino acids to proteins
such as arginylation, and ubiquitination. (See, for instance, PROTEINS -
STRUCTURE AND MOLECULAR PROPERTIES, 2nd Ed., T. E. Creighton, W.
H. Freeman and Company, New York (1993); POSTTRANSLATIONAL
COVALENT MODIFICATION OF PROTEINS, B. C. Johnson, Ed., Academic
Press, New York, pgs. 1-12 (1983); Seifter et al., Meth Enzymol 182:626-646 (1990);
Rattan et al., Ann NY Acad Sci 663:48-62 (1992).)

"SEQ ID NO:X" refers to a polynucleotide sequence while "SEQ ID NO: Y" 
refers to a polypeptide sequence, both sequences identified by an integer specified in 
Table 1. 

"A polypeptide having biological activity" refers to polypeptides exhibiting 
activity similar, but not necessarily identical to, an activity of a polypeptide of the
present invention, including mature forms, as measured in a particular biological assay,
with or without dose dependency. In the case where dose dependency does exist, it
need not be identical to that of the polypeptide, but rather substantially similar to the
dose-dependence in a given activity as compared to the polypeptide of the present
invention (i.e., the candidate polypeptide will exhibit greater activity or not more than
about 25-fold less and, preferably, not more than about tenfold less activity, and most
preferably, not more than about three-fold less activity relative to the polypeptide of the
present invention).
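The fold thresholds in this definition can be expressed as a small classification. This is a sketch under stated assumptions: the function name and tier labels are hypothetical, both activities are taken to be positive values measured in the same assay, and "about" is treated as an exact cutoff purely for illustration.

```python
def activity_tier(candidate_activity, reference_activity):
    """Classify a candidate's activity relative to a reference polypeptide.

    Hypothetical tiers built on the fold cutoffs in the definition above:
    not more than about 25-fold, tenfold, or three-fold less activity.
    """
    if candidate_activity <= 0 or reference_activity <= 0:
        raise ValueError("activities must be positive")
    # A value of 1 or below means the candidate is at least as active as the reference.
    fold_less = reference_activity / candidate_activity
    if fold_less <= 3:
        return "most preferred"
    if fold_less <= 10:
        return "preferred"
    if fold_less <= 25:
        return "within definition"
    return "outside definition"
```

A candidate with greater activity than the reference falls in the first tier, since the definition places no upper bound on activity.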

Polynucleotides and Polypeptides of the Invention

FEATURES OF PROTEIN ENCODED BY GENE NO: 1 

This gene is expressed primarily in melanocytes and, to a lesser extent, in 
testes, ovary, kidney and other tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancer, disorders of neural crest derived cells including pigmentation
defects, melanoma, reproductive organ defects, and defects of the kidney. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the skin,
reproductive, and renal systems, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treating disorders that arise from alterations in 
the number or fate of neural crest derived cells including cancers such as melanoma and 
defects of the developing reproductive system.

FEATURES OF PROTEIN ENCODED BY GENE NO: 2 

This gene is expressed primarily in infant brain and fetal lung.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, developmental disorders of the brain or lung. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the central nervous and pulmonary systems,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treating or diagnosing disorders associated 
with abnormal proliferation of cells in the central nervous system and developing lung.


FEATURES OF PROTEIN ENCODED BY GENE NO: 3 

This gene is expressed primarily in breast lymph node and to a lesser extent in 
ovarian cancer and chondrosarcoma. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune responses such as inflammation or immune surveillance for
tumors. This gene may be important for inflammatory responses associated with
tumors. Similarly, polypeptides and antibodies directed to these polypeptides are useful
in providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
immune system, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 236 as residues: Lys-45 to Val-50, Lys-69 to Arg-76.
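Residue ranges written in the form used for the preferred epitopes above can be converted into numeric intervals for downstream handling. The following is a minimal sketch, assuming the three-letter residue name, hyphen, position format; the function name is hypothetical, and no sequence from the sequence listing is reproduced or implied.

```python
import re

# Matches entries such as "Lys-45 to Val-50":
# residue name, position, the word "to", residue name, position.
# Optional whitespace is tolerated around the hyphen and around "to".
_RANGE = re.compile(r"([A-Z][a-z]{2})-\s*(\d+)\s*to\s*([A-Z][a-z]{2})-\s*(\d+)")


def parse_epitope_ranges(text):
    """Return (start_residue, start_pos, end_residue, end_pos) tuples found in text."""
    return [
        (m.group(1), int(m.group(2)), m.group(3), int(m.group(4)))
        for m in _RANGE.finditer(text)
    ]
```

For example, parse_epitope_ranges("Lys-45 to Val-50, Lys-69 to Arg-76.") yields two tuples, one per range, with the positions as integers.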

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment or diagnosis of immune responses 
including those associated with tumor-induced inflammation. 


FEATURES OF PROTEIN ENCODED BY GENE NO: 4 

This gene is expressed primarily in T-cells and T-cell lymphomas.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immunological diseases involving T-cells such as inflammation,
autoimmunity, and cancers including T-cell lymphomas. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of T-cells and other cells of the immune
system, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosing and treating T-cell based disorders 
such as inflammatory diseases, autoimmune disease and tumors including T-cell
lymphomas.

FEATURES OF PROTEIN ENCODED BY GENE NO: 5

This gene is expressed primarily in activated monocytes.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammation, autoimmunity, infection, or disorders involving activation
of monocytes. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the immune system, expression of this gene at significantly higher or lower levels may
be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 238 as residues: Asp-19 to Arg-31.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosing or treating diseases that result in 
activation of monocytes including infections, inflammatory responses or autoimmune 
diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 6 

The translation product of this gene shares sequence homology with terminal 
deoxynucleotidyltransferase, which is thought to be important in catalyzing the
elongation of oligo- or polydeoxynucleotide chains.

This gene is expressed primarily in activated human neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions which include, but are
not limited to, cancer, particularly those of the blood such as leukemia and deficiencies
in neutrophils such as neutropenia. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the cardiovascular system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution and homology to terminal deoxynucleotidyltransferase
indicate that polynucleotides and polypeptides corresponding to this gene are useful for
treatment and differential diagnosis of acute leukemias. Alternatively, this gene may
function in the proliferation of neutrophils and be useful as a treatment for neutropenia,
for example, following neutropenia as a result of chemotherapy.

FEATURES OF PROTEIN ENCODED BY GENE NO: 7 

The contig exhibits a reasonable homology to the human chorionic gonadotropin
(HCG) analogue-GT beta-subunit as disclosed in U.S. Patent No. 5,508,261 and PCT
Publication No. WO 92/22568. There is a high degree of conservation of the
structurally important cysteine residues in these identities.

This gene is expressed primarily in IL-1- and LPS-induced neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases of the immune system, including inflammatory diseases and
allergies. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the immune system, expression of this gene at significantly higher or lower levels may
be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment/diagnosis of diseases of the immune
system since expression is primarily in neutrophils, and may be useful as a growth
factor for the differentiation or proliferation of neutrophils for the treatment of 
neutropenia following chemotherapy. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 8 

This gene is expressed primarily in IL-1- and LPS-induced neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases of the immune system, including inflammatory diseases and
allergies. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the immune system, expression of this gene at significantly higher or lower levels may
be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 241 as residues: Ser-14 to Pro-22, Leu-43 to Val-53.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment and diagnosis of diseases of the
immune system since expression is primarily in neutrophils, and may be useful as a
growth factor for the differentiation or proliferation of neutrophils for the treatment of 
neutropenia following chemotherapy. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 9 

This gene is expressed primarily in IL-1- and LPS-induced neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases of the immune system, including inflammatory diseases and
allergies. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the immune system, expression of this gene at significantly higher or lower levels may
be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 242 as residues: Tyr-22 to His-35.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment/diagnosis of diseases of the immune
system since expression is primarily in neutrophils, and may be useful as a growth
factor for the differentiation or proliferation of neutrophils for the treatment of
neutropenia following chemotherapy.

FEATURES OF PROTEIN ENCODED BY GENE NO: 10 

This gene is expressed primarily in activated T-cells and to a lesser extent in
endothelial cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune dysfunctions including cancer of the T lymphocytes and
autoimmune disorders and inflammation. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment and diagnosis of immune disorders,
particularly of T-cell origin, and may act as a growth factor for particular subsets of
T-cells such as CD4 positive cells, which would make this a useful therapeutic for the
treatment of HIV and other immune compromising illnesses.


FEATURES OF PROTEIN ENCODED BY GENE NO: 11 

This gene is expressed primarily in fetal tissue. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of many developmental abnormalities. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the developing fetus,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful as a growth factor or differentiation factor for 
particular cell types in the developing fetus and may be useful in replacement or other
types of therapy in cases where the gene is expressed aberrantly. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 12 

This gene is expressed primarily in T-cells and to a lesser extent in tumor tissue
including glioblastoma, meningioma, and Wilms' tumor.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, diseases of the immune system including autoimmune conditions such as 
rheumatoid arthritis, inflammatory disorders and cancer. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 245 as residues:
Thr-9 to Ser-14.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis/modulation of immune function
disorders, including rheumatoid arthritis and inflammatory responses. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 13

This gene is expressed primarily in placenta and to a lesser extent in fetal liver 
and bone marrow. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of hematological disorders. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the hematological and immune
systems, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful as a growth factor for hematopoietic stem cells or
progenitor cells in the treatment of chemotherapy patients or kidney disease.

FEATURES OF PROTEIN ENCODED BY GENE NO: 14 

This gene is expressed primarily in stromal cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of hematopoietic disorders including cancer,
neutropenia, anemia, and thrombocytopenia. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the hematopoietic and immune systems, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful as a growth factor for hematopoietic stem cells or
progenitor cells, in particular following chemotherapy treatment. 


FEATURES OF PROTEIN ENCODED BY GENE NO: 15 

The translation product of this gene shares sequence homology with epsilon-COP
from Bos taurus, which is thought to be important as a component of coatomer, a
complex of seven proteins that is the major component of the non-clathrin membrane
coat. Preferred polypeptides encoded by this gene comprise the following amino acid
sequences: 

MAPPAPGPASGGSGEVDELFDVKNAFYIGSYQQCINEAXXVKLSSPERDVERD
VTLYRAYLAQRKFGVVLDEIKPSSAPELQAVRMFADYLAHESRRDSIVAELDRE
MSRSXLWTNTTTLLMAASIYLH^
RLDLARKELKRMQDLDEDATLTQLATAVWSLATGGEKLQDAYYIFQEMADKCS
PTLLLLNGQAACHMAQGRWEAAEGLLQEALDKDSGYPETLVNLIVLSQHLGKP
PEVTNRYLSQLKDAHRSHPFIKEYQAKENDFDRLVLQYAPSAEAGPELSGP
(SEQ ID NO:458); or RDVERDVFLYRAYLAQRKFGVVLDEIKPSSAPELQAVRMF
ADYIAHESRRDSIVAELDREMSRSXDVTNTTFLLMAASIYLHDQNPDAALRALH
QGDSLECTAMTVQILLKLDRLDLARKELKRMQDLDEDATLTQLATAWVSLATG
GEKLQDAYYIFQEMADKCSPTLLLLNGQAACHMAQGRWEAAEGLLQEALDKD
SGYPETLVNLIVLSQHLGKPPEVTNRYLSQLKDAHRSHPFIKEYQAKENDFDRL
VLQYAPSA (SEQ ID NO:459).

This gene is expressed primarily in activated monocytes and T-cells, and to a 
lesser extent in multiple other tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immunomodulation, specifically relating to transport problems in these
cells. Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
immune system, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution and homology to epsilon-COP indicate that
polynucleotides and polypeptides corresponding to this gene are useful for
treating/diagnosing problems with the cellular transport of proteins that may result in
immunologic dysfunction.

FEATURES OF PROTEIN ENCODED BY GENE NO: 16 

The translation product of this gene shares sequence homology with an RNA
helicase, which is thought to be important in polynucleotide metabolism. The translation
product of this contig exhibits good homology to the LbeIF4A antigen of Leishmania
braziliensis. The LbeIF4A antigen, or immunogenic portions of it, can be used to
induce protective immunity against leishmaniasis, specifically L. donovani, L. chagasi,
L. infantum, L. major, L. braziliensis, L. panamensis, L. tropica and L. guyanensis. It
can also be used diagnostically to detect Leishmania infection or to stimulate a cellular
and/or humoral immune response or to stimulate the production of interleukin-12.

This gene is expressed primarily in colon cancer and to a lesser extent in
pituitary.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of cancers, particularly of the colon. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the gastrointestinal
system, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 249 as residues: Glu-93 to Ala-98, Gln-150 to Leu-156,
Leu-220 to Leu-231, Leu-268 to Arg-273, Val-324 to Pro-341, Arg-372 to Asn-380,
Ser-405 to Gly-410, Phe-426 to Ala-433, Glu-458 to Asp-470, Arg-506 to Ser-547.
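
Epitope residue ranges written in this form (three-letter residue code, position, "to", three-letter residue code, position) can be turned into peptide fragments mechanically. The following Python sketch is illustrative only: the function name, the regular expression, and the toy sequence in the usage note are assumptions made here and are not taken from the sequence listing. It reads each range, slices the 1-based inclusive span out of a one-letter amino acid sequence, and checks that the boundary residues agree with the named residues.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Matches ranges such as "Glu-93 to Ala-98".
RANGE_RE = re.compile(r"([A-Z][a-z]{2})-(\d+)\s+to\s+([A-Z][a-z]{2})-(\d+)")


def extract_epitopes(sequence, ranges_text):
    """Return the peptide for each 'Xaa-N to Yaa-M' range (1-based, inclusive)."""
    epitopes = []
    for start_aa, start, end_aa, end in RANGE_RE.findall(ranges_text):
        start, end = int(start), int(end)
        if not 1 <= start <= end <= len(sequence):
            raise ValueError(f"range {start}-{end} lies outside the sequence")
        peptide = sequence[start - 1:end]
        # Guard against numbering errors: boundary residues must match the names.
        if peptide[0] != THREE_TO_ONE[start_aa] or peptide[-1] != THREE_TO_ONE[end_aa]:
            raise ValueError(f"boundary residues do not match for {start}-{end}")
        epitopes.append(peptide)
    return epitopes
```

With a made-up nine-residue sequence, `extract_epitopes("MKTAYSLLE", "Thr-3 to Ser-6, Leu-7 to Glu-9")` yields the two fragments covering positions 3 to 6 and 7 to 9. The boundary check is a design choice of this sketch: it catches ranges whose numbering has been damaged before a wrong fragment is returned.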

The tissue distribution and homology to RNA helicase indicate that
polynucleotides and polypeptides corresponding to this gene are useful for development 
of diagnostic tests for colon cancer. 


FEATURES OF PROTEIN ENCODED BY GENE NO: 17 

The translation product of this contig has sequence homology to a cytoplasmic
protein in mice that binds specifically to JNK, designated JNK interacting protein-1
or JIP-1. JIP-1 caused cytoplasmic retention of JNK and inhibition of JNK-regulated
gene expression.

This gene is expressed primarily in brain, including pituitary, cerebellum, frontal
cortex, and fetal brain, and to a lesser extent in the kidney cortex.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of central nervous system disorders including
ischemia, epilepsy, Parkinson's disease, and schizophrenia. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the central nervous system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Furthermore, the translation product of this contig may suppress the effects of
the JNK signaling pathway on cellular proliferation, including transformation by the
Bcr-Abl oncogene. Preferred epitopes include those comprising a sequence shown in
SEQ ID NO: 250 as residues: Pro-6 to Ser-26, Ala-30 to Asp-41, Gly-55 to Ser-61,
Gly-74 to Thr-80, Tyr-117 to Ala-123, Tyr-167 to Asp-172, Ala-212 to Cys-223,
Pro-239 to Tyr-244.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for enhanced survival and/or differentiation of
neurons as a treatment for neurodegenerative disease. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 18 

The translation product of this gene shares sequence homology with a liver
stage antigen from a protozoan parasite.

This gene is expressed primarily in fetal tissue and to a lesser extent in activated 
T-cells and other immune cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, developmental abnormalities and diseases of immune function. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the immune system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution and homology to a protozoan antigen indicate that
polynucleotides and polypeptides corresponding to this gene are useful for
treatment/immune modulation of parasitic infections.

FEATURES OF PROTEIN ENCODED BY GENE NO: 19

Preferred polypeptides encoded by this gene comprise the following polypeptide
sequences:

MKAIGIEPSLATYHHIIRLFDQPGDPLKRSSFIIYDIMNELMGKRFSPKD
PDDDKFFQSAMSICSSLRDLELAYQVHGLLKTGDNWKHGPDQHRNFYYSKFF
DLICLMEQIDVTLKWYEDLIPSAYFPHSQTMIHLLQALDVANRLEVIPKIWER
(SEQ ID NO:460); and/or KDSKEYGHTFRSDLREEILMLMARDKHPPELQVAF
ADCAADIKSAYESQPIRQTAQDWPATSLNCIAILFLRAGRTQEAWKMLGLFRKH
NKJPRSELLNELMDSAKVSNSPSQAIEVVELASAFSLPICEGLTQRVMSDFAENQ
EQKEALSNLTALTSDSDTDSSSDSDSDTSEGK (SEQ ID NO:461). Polynucleotides
encoding such polypeptides are also provided.

This gene is expressed primarily in stromal and CD34 depleted bone marrow 
cells and to a lesser extent in tissues of embryonic origin. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases of hematologic origin including cancers and immune
dysfunction. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the hematopoietic and immune systems, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 252 as residues: Ser-28 to Gln-34.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful as a growth factor for hematopoietic stem cells or 
progenitor cells, which may be useful in the treatment of chemotherapy patients
suffering from neutropenia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 20

Preferred polypeptide fragments can be found in an alternative open reading 
frame. These preferred polypeptides comprise the amino acid sequence: 
MSSDNESDEDEDLKLELRRLRDKHLKEIQDLQSRQKHEIESLYTKLGKVPPAVI
IPPAAPLSGRRRRPTKSKGSKSSRSSSLGNKSPQLSGNLSGQSAASVLHPQQTL
HPPGNIPESGQNQLLQPLKPSPSSDNLYSAFTSDGAISVPSLSAPGQGTSSTNTV
GATVNSQAAQAQPPAMTSSRKGTFTDDLHKLVDNWARDAMNLSGRRGSKGH
MNYEGPGMARKFSAPGQLCISMTSNLGGSAPISAASATSLGHFTKSMCPPQQY
GFPATPFGAQWSGTGGPAPQPLGQFQPVGTASLQNFNISNLQKSISNPPGSNL
RTT (SEQ ID NO:462); IQDLQSRQKHEIESLYTKLGKVPPAVIIPPAAPLSGRRRR
PTKSKGSKSSRSSSLGNKSPQLSGNLSGQSAASVLHPQQTLHPPGNIPESGQN
QLLQPLKPSPSSDNLYSAFTSDGAISVPSLSAPGQGTSST (SEQ ID NO:463);
TSDGAISVPSLSAPGQGTSSTNTVGATVNSQAAQAQPPAMTSSRKGTFTDDLH
(SEQ ID NO:464); KGHMNYEGPGMARKFSAPGQLCISMTSNLGGSAPISAAS
ATSLGHFTK (SEQ ID NO:465); QPLKPSPSSDNLYSAFTSDGAISVPSLSAPG
(SEQ ID NO:466). Also preferred are polynucleotide fragments encoding these
polypeptide fragments.

This gene is expressed in fetal liver and tissues associated with the CNS.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, liver and CNS diseases. Similarly, polypeptides and antibodies directed
to these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the liver and CNS, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 253 as residues:
Gln-26 to Lys-34.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of liver diseases such
as hepatocellular carcinomas and diseases of the CNS.

FEATURES OF PROTEIN ENCODED BY GENE NO: 21

In an alternative reading frame, this gene shows sequence homology to two
recently cloned genes, karyopherin beta 3 and Ran_GTP binding protein 5. (See
Accession Nos. gi|2102696 and gnl|PID|e328731.) The Ran_GTP binding protein is
related to importin-beta, the key mediator of nuclear localization signal (NLS)-dependent
nuclear transport. Based on homology, it is likely that this gene has activity
similar to the Ran_GTP binding protein. Preferred polypeptide fragments comprise the
amino acid sequence: VRVAAAESMXLLLECAXVRGPEYLTQMWHFMCDALIKA
IGTEPDSDVLSELMHSFAK (SEQ ID NO:467). Also preferred are polynucleotide
fragments encoding these polypeptide fragments.

This gene is expressed in thymus tissue. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of immune disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 22 

This gene is expressed primarily in prostate and osteoclastoma tissues. 

Preferred polypeptide fragments also comprise the amino acid sequence:
MEINNQNCnVIDLVRTVMENGVEGLLIFGAFLPESWLIGVRCSSEPPKALLLIL
AHSQKRRLDGWSFIRHLRVHYCVSLTIHFS (SEQ ID NO:468). Also preferred are
polynucleotide sequences encoding this polypeptide fragment.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, bone and prostate diseases, and cancers, particularly of the bone and
prostate. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the bone and prostate systems, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 255 as residues: Met-1 to Ser-11.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of bone and prostate
disorders, especially cancers of those systems.

FEATURES OF PROTEIN ENCODED BY GENE NO: 23 

This gene shares sequence homology with the FK506-binding protein (FKBP-13)
family, a known cytosolic receptor for the immunosuppressants. Recently, another
group has cloned a very similar gene, recognizing the homology to the FK506-binding
protein family and calling their gene FKBP23. (See Accession No. 2827255.)

This gene is expressed primarily in lymphoid tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample, especially for those susceptible to immune suppressant therapies and
for diagnosis of diseases and conditions, which include, but are not limited to, immune
suppressant disorders. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 256 as residues: Ala-19 to Val-31,
Arg-38 to Gly-49, Ala-61 to Lys-66, Tyr-68 to Pro-78, Gly-116 to Ala-121, Asp-154 to
Ser-162, Glu-173 to Gln-186, Phe-194 to Gly-203, Pro-207 to Val-212.

The tissue distribution and homology to FKBP-12 and -13 indicate that
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis
and treatment of immune suppressant disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 24

This gene is expressed primarily in the brain and in the retina. This gene maps 
to chromosome 8, and therefore can be used in linkage analysis as a marker for 
chromosome 8.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurological and ocular associated disease states. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the central
nervous system, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 257 as residues: Cys-34 to Asp-40.

The tissue distribution in retina indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the treatment and/or detection of eye disorders
including blindness, color blindness, impaired vision, short and long sightedness,
retinitis pigmentosa, retinitis proliferans, and retinoblastoma. Expression in the brain
indicates that this gene is useful for the detection/treatment of neurodegenerative disease
states and behavioral disorders such as Alzheimer's Disease, Parkinson's Disease,
Huntington's Disease, schizophrenia, mania, dementia, paranoia, obsessive compulsive
disorder and panic disorder.

FEATURES OF PROTEIN ENCODED BY GENE NO: 25 

This gene shows sequence homology to a newly identified class of proteins
expressed in the nervous system, called the stathmin family. (See Accession No. 2585991;
see also Eur. J. Biochem. 248 (3), 794-806 (1997).) The stathmin family appears to be
a ubiquitous phosphoprotein family involved as a relay integrating various intracellular
signaling pathways. These pathways affect cell proliferation and differentiation.
Preferred polypeptide fragments comprise the amino acid sequence:
QDKHAEEVRKNKELKEEASR (SEQ ID NO:469); QQDLSPWAAPVGCPLXXASX
TCHXLPLSGCLRRQSXSLPVVAXLCFWFSCPLASLFVPGQPCVTCPFPSLPFQD
KHAEEVRKNKELKEEASR (SEQ ID NO:470). Also preferred are the
polynucleotide fragments encoding these polypeptide fragments.

This gene is expressed primarily in brain.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurological disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the central nervous system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the detection/treatment of neurodegenerative
disease states and behavioral disorders such as Alzheimer's Disease, Parkinson's
Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia, obsessive
compulsive disorder and panic disorder.

FEATURES OF PROTEIN ENCODED BY GENE NO: 26

The polynucleotide sequence of this gene contains a domain similar to a Flt3 
ligand peptide. Preferred polypeptide fragments comprise the amino acid sequence: 
PTRCCTTQPCRSSARRPCWVPMVPSPEGREXQPTCPS (SEQ ID NO:471). Thus,
this gene may have activity in binding to Flt3 receptors, a process known to promote
angiogenesis and/or lymphangiogenesis.

This gene is expressed in human tonsil, and to a lesser extent in 
teratocarcinoma, placenta, colon carcinoma, and fetal kidney. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for identification of the tissue(s) or cell type(s) present in a biological sample 
and for diagnosis of diseases and conditions, which include, but are not limited to,
diseases of the tonsil, as well as cancers, such as colon, reproductive, and kidney
cancers. Similarly, polypeptides and antibodies directed to these polypeptides are useful
in providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
tonsils, colon, reproductive organs, and kidneys, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 259 as residues:
Pro-22 to Glu-33.

The tissue distribution in tonsil and several cancers and fetal tissues indicates 
that polynucleotides and polypeptides corresponding to this gene are useful for 
diagnosis and treatment of diseases of the tonsil or colon, such as tonsillitis and
inflammatory diseases involving the nose and paranasal sinuses, especially during
infection with influenza virus, adenoviruses, parainfluenza viruses, or rhinoviruses. The
gene may also be useful in the diagnosis and treatment of neoplasms of nasopharyngeal
or colonic origin.

FEATURES OF PROTEIN ENCODED BY GENE NO: 27 

A large open reading frame that encodes a preferred polypeptide exists in an
alternative reading frame. Preferred polypeptide fragments comprise the amino acid
sequence:

MKRSLNENSARSTAGCLPVPLFNQKKRNRQPLTSNPLKDDSGISTPSDNYDFP
PLPTDWAWEAVNPEXAPVMKTVDTGQIPHSVSRPLRSQDSVFNSIQSNTGRSQ
GGWSYRDGNKNTSLKTWXKNDFKPQCKRTNLVANDGKNSCPMSSGAQQQK
QLRTPEPPNLSRNKETELLRQTHSSKISGCTMRGLDKNSALQTLKPNFQQNQY
K^QMLDDIPEDNTLKETSLYQLQFK^
FEVLAVLDSAVTPGPYYSKTFLMRDGKNTLPCVFYEIDRELPRLIRGRVHRCVG
NYDQKKNIFQCVSVRPASVSEQKTFQAFVKiADVEMQYYINVMNET (SEQ ID
NO:472); SQDSVFNSIQSNTGRSQGGWSYRDGNKNTSLKTWXKNDFKPQCKR
(SEQ ID NO:473); NKETELLRQTHSSKISGCTMRGLDKNSALQTLKPNF (SEQ ID
NO:474); SSLRUSAVIESMKYWREHAQKTVLLFEVLAVLDSAVTPGPYYSKTFLM
(SEQ ID NO:475); and PRLIRGRVHRCVGNYDQKKNIFQCVSVRPASVSEQKT
FQAFV (SEQ ID NO:476).

This gene is expressed primarily in human testes. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, male reproductive disorders, including cancer. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the male reproductive system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.
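
As an illustration only, the comparison described above, in which a sample's expression level is measured against the standard level from healthy tissue or fluid, can be sketched as a simple fold-change check. The function name, the two-fold threshold, and the numeric levels are assumptions made for this sketch; the text does not specify how "significantly higher or lower" is to be determined.

```python
def classify_expression(sample_level, standard_level, fold_threshold=2.0):
    """Compare a sample's expression level with the standard (healthy) level.

    fold_threshold is a hypothetical cutoff chosen for illustration;
    the source text does not define one.
    Returns "higher", "lower", or "comparable".
    """
    if sample_level < 0 or standard_level <= 0:
        raise ValueError("sample must be non-negative and standard must be positive")
    if fold_threshold <= 1:
        raise ValueError("fold_threshold must be greater than 1")

    ratio = sample_level / standard_level
    if ratio >= fold_threshold:
        return "higher"
    if ratio <= 1.0 / fold_threshold:
        return "lower"
    return "comparable"
```

In practice the cutoff would come from a statistical analysis of replicate measurements rather than a fixed ratio; the fixed ratio is used here only to keep the sketch short.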

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful as a hormone with reproductive or other systemic
functions; in contraceptive development; and for male infertility of testicular causes, such as
Klinefelter's syndrome, varicocele, and orchitis; male sexual dysfunctions; testicular
neoplasms; and inflammatory disorders such as epididymitis.

FEATURES OF PROTEIN ENCODED BY GENE NO: 28 

This gene is expressed primarily in apoptotic T cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases relating to T cells, as well as cancer in general. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the
immune system, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for immune disorders. Moreover, because the gene
was isolated from an apoptotic cell, and given the understood relationship
between apoptosis and cancer, it is likely that this gene plays a role in the genesis of
cancer.



WO 98/54963 




FEATURES OF PROTEIN ENCODED BY GENE NO: 29 

This gene is expressed primarily in human tonsils. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, gastrointestinal disorders. Similarly, polypeptides and antibodies directed
to these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the gastrointestinal system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the diagnosis and treatment of gastrointestinal 
diseases. 



FEATURES OF PROTEIN ENCODED BY GENE NO: 30

The translation product of this gene shares sequence homology with the C44C1.2
gene product of Caenorhabditis elegans, whose function is unknown. Preferred polypeptide
fragments comprise the amino acid sequence:

GVFTIPCVCGRPASLTCSPLDPEVGPYCDTPTMRTLFNLLWLALACSPVHTTLSK
SDAKKAASKTLLEKSQFSDKPVQDRGLVVTDLKAESVVLEHRSYCSAKARDRH
FAGDVLGYVTPWNSHGYDVTKVFGSKJ^QISPVWLQLKJ^RGREMFEVTGLHD
VDQGWMRAVRKHAKGLHIVPRLLFEDWTYDDFRNVLDSEDEIEELSKTVVQVA
KNQHFDGFVVEVWNQLIJSQK^

DQLGMFTHKEFEQLAPVLDGFSLMTYDYSTAHQPGPNAPLSWVRACVQVLDP
KXKWRTKSSWGSTSMXWTXRXPXDARXPVVGXRXIQXLKDHXPRMVLDSK
PQ (SEQ ID NO:477); TCSPLDPEVGPYCDTPTMRTLFNLLWLALACSPVHTTLS
(SEQ ID NO:478); LVVTDLKAESVVLEHRSYCSAKARDRHFAGDVLGYVTPW
NSHGYDVTKVFGSKF (SEQ ID NO:479); REMFEVTGLHDVDQGWMRAVRK
HAKGLHIVPRLLFEDWTYDDFRNVLDSEDE (SEQ ID NO:480); HFDGFVVEVW
NQLI^QKRVGLIHMLTHLAEALHQARLLALLVIPPAITPGTDQLGM (SEQ ID
NO:481); DGFSLMTYDYSTAHQPGPNAPLSWVRACVQVLDPKXKWRTKSSW
GST (SEQ ID NO:482). Also preferred are polynucleotide fragments encoding these
polypeptide fragments. This gene maps to human chromosome 11, and therefore is
useful in linkage analysis as a marker for chromosome 11.

This gene is expressed primarily in human T cells and to a lesser extent in
human colon carcinoma.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders and cancer. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the immune and gastrointestinal systems,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 263 as residues: Leu-21 to Ala-30, Ser-38 to Asp-47, Pro-87 to Asp-94, Leu-197
to Thr-204, Pro-256 to Ser-262, Thr-277 to Arg-282, Thr-293 to Trp-303.
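
The residue ranges used to name preferred epitopes follow a regular pattern (three-letter residue, position, "to", three-letter residue, position), so they can be read mechanically. The following is a minimal sketch of parsing such a list and cutting the corresponding segment out of a polypeptide. The function names and the short toy sequence in the usage note are hypothetical; SEQ ID NO:263 itself is not reproduced in this section, so no real epitope is extracted here.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Matches ranges such as "Leu-21 to Ala-30"; tolerates stray spaces after the hyphen.
_RANGE = re.compile(r"([A-Z][a-z]{2})-\s*(\d+)\s+to\s+([A-Z][a-z]{2})-\s*(\d+)")


def parse_epitope_ranges(text):
    """Return a list of (start_residue, start_pos, end_residue, end_pos) tuples."""
    return [(a, int(i), b, int(j)) for a, i, b, j in _RANGE.findall(text)]


def extract_epitope(sequence, start_res, start, end_res, end):
    """Return residues start..end (1-based, inclusive) of a one-letter sequence.

    Raises ValueError if the range is out of bounds or if the named
    boundary residues do not match the sequence.
    """
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("range lies outside the sequence")
    if sequence[start - 1] != THREE_TO_ONE[start_res]:
        raise ValueError("start residue does not match the sequence")
    if sequence[end - 1] != THREE_TO_ONE[end_res]:
        raise ValueError("end residue does not match the sequence")
    return sequence[start - 1:end]
```

For example, with the made-up sequence `"MKLSTAGD"`, the range "Leu-3 to Ala-6" yields `"LSTA"`. Checking the boundary residues against the sequence is a cheap guard against an off-by-one position or a range applied to the wrong sequence.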

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis and treatment of immune
disorders and gastrointestinal diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 31 

The translation product of this gene shares sequence homology with ribosomal
protein L11 of Caenorhabditis elegans. (See Accession No. 156201.) Preferred
polypeptide fragments comprise the amino acid sequence:
ERGVSINQFCKEFNERTKDIKEGIPLCT

IEKGARQTGKEVAGLVTLKHVYEIARIKAQDEAFALQDVPLSSVVRSnGSARSL
GIRVVKDLSSEELAAFQKERAIFLAAQKEADLAAQEEAAKK (SEQ ID NO:483).
Also preferred are polynucleotide fragments encoding these polypeptide fragments.

This gene is expressed in human embryo tissue and to a lesser extent in human 
epithelioid sarcoma and other tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for identification of the tissue(s) or cell type(s) present in a biological sample
and for diagnosis of diseases and conditions, which include, but are not limited to,
developmental disorders and epithelial cell cancer. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the embryonic and epithelial cell systems,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 264 as residues: Lys-34 to Gly-40.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the diagnosis and treatment of developmental 
disorders and epithelial cancer. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 32

This gene is expressed primarily in resting T cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammatory and general immune disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune system, expression of this gene
at significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.



The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis and treatment of disorders of
the immune system.

FEATURES OF PROTEIN ENCODED BY GENE NO: 33

This gene is believed to reside on chromosome 1. Accordingly, polynucleotides
derived from this gene are useful in linkage analysis as chromosome 1 markers.

This gene is expressed primarily in prostate and to a lesser extent in Soares adult
brain, human umbilical vein endothelial cells, and amniotic cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, prostate-related disorders. Similarly, polypeptides and antibodies 
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the urinary system and nervous system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that the protein products of this gene are useful
for the diagnosis and treatment of disorders of the urinary and nervous systems.

FEATURES OF PROTEIN ENCODED BY GENE NO: 34 

This gene shares sequence homology with the R05G6.4 gene product. (See Accession No.
gi|1326338.) This gene also shares sequence homology with the cyclophilin-like protein
CyP-60. (See Accession No. 1199598; see also Biochem. J. 314 (1), 313-319
(1996).) Preferred polypeptide fragments comprise the amino acid sequence:
AVYTYHEKKJsT>TAASGYGTQNIRL^
YEREAILEYILHQKKEIARQMK^

SAIVSRPLNPFTAKALSGTSPDDVQPGPSVGPPSKDKDKVLPSFWIPSLTPEAK
ATKLEKPSRTVTCPMSGKPLRMSDLTPVHFTPLDSSVDRVGLITRSERYVCAVT
RDSLSNATPCAVLRPSGAWTLECVEmRKDMVDPVTGDKLTDRDIIVLQRGT
(SEQ ID NO:484); YLYEREAILEYILHQKKEIARQMKAYEKQRGTRREEQKHLQ
RAASQDHVRGFLE (SEQ ID NO:485); and FTAKALSGTSPDDVQPGPSVGPP
SKDKDKVLPSFWIPSLTPEAKATKLEKPSRTVTCPMSGKPL (SEQ ID NO:486).
Also preferred are polynucleotide fragments that encode these polypeptide fragments.

This gene is expressed primarily in human testis and to a lesser extent in other
tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, male reproductive disorders and in particular testicular cancer. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of disorders of the
male reproductive system and in particular of testicular cancer.

FEATURES OF PROTEIN ENCODED BY GENE NO: 35 

The translation product of this gene shares sequence homology with Lpe5p of
Saccharomyces cerevisiae, which is thought to be important in the metabolism of
phospholipids.

This gene is expressed primarily in liver and brain. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, metabolic disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the metabolic and nervous systems, expression of this
gene at significantly higher or lower levels may be routinely detected in certain tissues
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine,
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual
having such a disorder, relative to the standard gene expression level, i.e., the
expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 268 as residues: Pro-14 to Leu-20, Lys-28 to Asn-38, Arg-109 to Arg-114,
Lys-119 to Asn-124, Glu-152 to Leu-157, Pro-172 to Val-180.

The tissue distribution and homology to Lpe5p of Saccharomyces cerevisiae
indicate that polynucleotides and polypeptides corresponding to this gene are useful for
the diagnosis and treatment of metabolic and nervous disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 36

This gene shares sequence homology with the nuclear ribonucleoprotein U (HNRNP
U) encoded by C. elegans. (See Accession No. gi|1703576.) Preferred polypeptide
fragments comprise the amino acid sequence:
MDTSENRPENDVPEPPMPIADQVSNDDRPEGSVEDEEKKESSLPKSFKRKISVV
SATKGVPAGNSDTEGGQPGRKRRWGASTATTQKKPSISITTESLKSLIPDIKPL
AGQEAVVDLHADDSRISEDETERNGDDGTHDKGLKJCRTVTQVVPAEGQENGQ
REEEEEEKEPEAEPPVPPQVSVEVALPPPAEHEVKKVTLGDTLTRRSISQQKSGV
SITIDDPVRTAQVPSPPRGKISNIVHISNLVRPFTLGQLKELLGRTGTLVEEAFWI
DKJKSHCFVTYSTVEEAVATRTALHGVKWPQSNPKFLCADYAEQDELDYHRGL
LVDRPSETKTEEQGIPRPLHPPPPPPVQPPQHPRAEQREQERAVREQWAERERE
MERRERTRSEREWDRDKVREGPRSRSRSRXRRRKERAKSKEKKSEKKEKAQE
EPPAKJ^LDDLPTIKTKAAPCIYWLPLTDSQIVQKEAERAERAK£REKRRKEQEEE
EQKEREKEAERERNRQLEREKRREHSRERDRERERERERDRGDRDRDRERDRE
RGRERDRRDTKRHSRSRSRSTPVRDRGGR (SEQ ID NO:488). Also preferred are
the polynucleotide fragments encoding this polypeptide fragment.

This gene is expressed primarily in epididymis.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases of the male reproductive system. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the male reproductive system, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the diagnosis and treatment of male 
reproductive disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 37

This gene is expressed primarily in the amygdala.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammatory diseases and reproductive disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the amygdala,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of inflammatory
diseases and reproductive disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 38 

This gene shares sequence homology with human opsonin protein P35
fragment. (See Accession No. R94181.) The opsonin protein activates the phagocytosis
of pathogenic microbes by phagocytic cells. Preferred polypeptide fragments comprise
the amino acid sequence: GCDSCPPHLPREAFAQDTQAEGECSSRAERADMCPDAP 
PSQEVPEGPGAAP (SEQ ID NO:489). Also preferred are polynucleotide fragments 
encoding these polypeptide fragments. 

This gene is expressed in immune-related tissues and cells such as thymus, macrophages,
and T cells, and to a lesser extent in many other tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders and infectious disease. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune system and infectious disease,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 271 as residues: Lys-9 to Arg-14, Met-38 to Asp-51.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of immune disorders,
as well as the treatment and/or diagnosis of infectious disease.

FEATURES OF PROTEIN ENCODED BY GENE NO: 39 

The translation product of this gene shares sequence homology with alpha-2
type I collagen, which is thought to be important in tissue repair. (See, e.g., 211607.)
Preferred polypeptide fragments comprise the amino acid sequence: PQLPSCGRPW
PGTASVFQSHTQGPREDPDPCRAQGSAGTHCPISLSPPRQ (SEQ ID NO:490).
Also preferred are the polynucleotide sequences encoding these polypeptide sequences.

This gene is expressed primarily in the brain and to a lesser extent in the kidney
and thymus.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, brain, kidney, and immune disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the brain, kidney, and immune system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution and homology to alpha-2 type I collagen indicate that
polynucleotides and polypeptides corresponding to this gene are useful for tissue repair
and for diagnosis and treatment of brain, kidney, and immune disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 40 

The translation product of this gene shares sequence homology with
mini-collagen, which is thought to be important in tissue repair and tumor metastasis. (See
Accession No. gnl|PID|d1006976.) Preferred polypeptide fragments comprise the
amino acid sequence: PGFRGPSGSLGCSFFPRSLGRVLPPGCQRPGAHAD
SSPPPTP (SEQ ID NO:491). Also preferred are polynucleotides encoding this
polypeptide fragment.

This gene is expressed in ovarian cancer and to a lesser extent in dendritic cells
and smooth muscle.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, tumor metastasis and tissue repair. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly those involving tumor metastasis and tissue repair,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 273 as residues: Asn-2 to His-11.



The tissue distribution and homology to the mini-collagen gene indicate that
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis
and treatment of tumor metastasis and tissue repair.

FEATURES OF PROTEIN ENCODED BY GENE NO: 41

This gene shares sequence homology with the HIV TAT protein. (See
Accession No. 328416.) Preferred polypeptide fragments comprise the amino acid
sequence: EDLKKPDPASLRAASCGEGKKRKACKNCTCGLAEELEKEK
SREQMSSQPKSACGNCYLGDAFRCASCPYLGMPAFKPGEKVLLS (SEQ ID
NO:492); EDLKKPDPASLRAASCGEGKKRKACKNCTCGLAEELEKEK
SREQMSSQPKSACGNCYLGDAFRCASCPYLGMPAFKPGEKVLLSDSNLHD
(SEQ ID NO:493); CGNCYLGDAFRCASCPYLGMPAFKPGEKVLLSDS
(SEQ ID NO:494); SCGEGKKRKACKNCTCGLAEELEKE (SEQ ID NO:495);
SQPKSACGNCYLGDAFRCASC (SEQ ID NO:496); and REAGQNSERQYVS
LSRD (SEQ ID NO:497). Also preferred are polynucleotide fragments encoding these
polypeptide fragments.

This gene is expressed primarily in the infant brain and to a lesser extent in the
breast and testes.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, brain, testes and breast disorders. Similarly, polypeptides and antibodies 
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the brain, testes, and breast,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 274 as residues: Pro-7 to Val-15.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of brain, testes,
breast, and other related disorders.
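
Several of the preferred fragments listed for Gene No. 41 are sub-sequences of the longest fragment (SEQ ID NO:493). A minimal sketch of how such a relationship can be checked is given here; the function name is hypothetical, and the sequences are copied from the fragment list for this gene.

```python
def locate_fragment(parent, fragment):
    """Return the 1-based (start, end) of the first exact match, or None."""
    idx = parent.find(fragment)
    if idx == -1:
        return None
    return (idx + 1, idx + len(fragment))


# Fragments as listed for Gene No. 41.
SEQ_493 = (
    "EDLKKPDPASLRAASCGEGKKRKACKNCTCGLAEELEKEK"
    "SREQMSSQPKSACGNCYLGDAFRCASCPYLGMPAFKPGEKVLLSDSNLHD"
)
SEQ_495 = "SCGEGKKRKACKNCTCGLAEELEKE"
SEQ_496 = "SQPKSACGNCYLGDAFRCASC"
SEQ_497 = "REAGQNSERQYVSLSRD"

for name, frag in (("495", SEQ_495), ("496", SEQ_496), ("497", SEQ_497)):
    print("SEQ ID NO:" + name, locate_fragment(SEQ_493, frag))
```

An exact-match search is enough here because the fragments are given verbatim. SEQ ID NO:497 does not occur within SEQ ID NO:493, so the search reports no match for it.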

FEATURES OF PROTEIN ENCODED BY GENE NO: 42 

This gene is expressed primarily in the infant brain, human cerebellum, and to a
lesser extent in medulloblastoma.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, brain-related disorders, medulloblastoma, and other brain cancers.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly
brain-related disorders and brain cancers, including medulloblastoma, expression of this
gene at significantly higher or lower levels may be routinely detected in certain tissues
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine,
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual
having such a disorder, relative to the standard gene expression level, i.e., the
expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 275 as residues: Thr-41 to Glu-47.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of human brain-related
disorders, brain cancers, and medulloblastoma.



FEATURES OF PROTEIN ENCODED BY GENE NO: 43

The translation product of this gene shares sequence homology with a
phosphotyrosine-independent ligand for the lck SH2 domain, which is thought to be
important in signal transduction involving such ligands.
(See Accession No. gi|1184951.) Preferred polypeptide fragments
comprise the amino acid sequence: ESSGQARTLADPGPGWPRQQGMCFGSLT
GLSTTPHGFLTVSAEADPRI.mSLSQMLSMGFSDEGGWLTRLLQTKNYDIGAAL
DTIQYSKH (SEQ ID NO:498). Also preferred are polynucleotide fragments encoding
this polypeptide fragment. It is likely that this gene is a new member of a family of
phosphotyrosine-independent ligands for the lck SH2 domains.

This gene is expressed primarily in the placenta and to a lesser extent in
endothelial cells and neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, reproductive, cardiovascular, immune, and infectious diseases.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
cardiovascular, reproductive, and immune systems, and in infectious diseases, expression
of this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution and homology to a phosphotyrosine-independent ligand
for the lck SH2 domain indicate that polynucleotides and polypeptides corresponding
to this gene are useful for diagnosis and treatment of cardiovascular, reproductive, and
immune system diseases, as well as infectious diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 44

This gene is expressed primarily in the fetal brain and cerebellum, and to a lesser
extent in the placenta.

Therefore, polynucleotides and polypeptides of the invention are useful as 
5 reagents for differential identification of the lissue(s) or cell type(s) present in a 

biological sample and for diagnosis of diseases and conditions, which include, but arc 
not limited to, neuronal cell related disorders. Similarly, polypeptides and antibodies 
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or cell type(s). For a number of disorders of 
the above tissues or cells, particularly of the neuronal cell related disorders, expression
of this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, 
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 277 as residues: Thr-20 to Gly-28. 

The tissue distribution and homology to proline-rich protein genes indicate that
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis
and treatment of neuronal cell related disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 45 

The translation product of this gene shares sequence homology with 
precerebellin of human, which is thought to be important in synaptic physiology. (See
Accession No. gi|180251.) It has been observed that cerebellin-like immunoreactivity is
associated with Purkinje cell postsynaptic structures. Thus, it is likely that this gene
also has synaptic activity. Preferred polypeptide fragments comprise the amino acid
sequence: QEGSEPVLLEGECLVVCEPGRAAAGGPGGAALGEAPPGRVAFXAV 
RSHHHEPAGETGNGTSGAIYFDQVLVNEGGGFDRASGSFVAPVRGVYSFRFH
VVKVYNRQTVQVSLML^mVPVISAFANDPDVTREAATSSVLLPLDPGDRVSLR
LRRGXSTGW (SEQ ID NO:499). Also preferred are polynucleotide fragments 
encoding these polypeptide fragments. 

This gene is expressed primarily in cerebellum and infant brain. By Northern 
analysis, a single transcript of 2.4 kb was observed in brain tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neuronal cell signal transduction and synaptic physiology. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the neuronal cell 
signal transduction and synaptic physiology, expression of this gene at significantly
higher or lower levels may be routinely detected in certain tissues (e.g., cancerous and 
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal 
fluid) or another tissue or cell sample taken from an individual having such a disorder, 
relative to the standard gene expression level, i.e., the expression level in healthy tissue 
or bodily fluid from an individual not having the disorder.

The tissue distribution and homology to precerebellin indicate that
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis 
and treatment of neuronal cell related disorders. 



FEATURES OF PROTEIN ENCODED BY GENE NO: 46

This gene is expressed in fetal liver and spleen, and to a lesser extent in bone 
marrow, umbilical vein, and T cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, disorders of the immune system, particularly hematopoiesis. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the hematopoiesis
and immune disorders, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or 
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another 
tissue or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 279 as residues: Asp-30 to Glu-57. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of hematopoietic and
immune disorders. 



FEATURES OF PROTEIN ENCODED BY GENE NO: 47

The translation product of this gene shares sequence homology with a 12 kD 
nucleic acid binding protein of feline calicivirus, which is thought to be important in viral
replication. (See Accession No. 59264.)

This gene is expressed primarily in human cardiomyopathy and to a lesser 
extent in T helper cells, fetal brain and synovial sarcoma. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, cardiomyopathy as well as viral infection. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the cardiovascular system, expression of 
this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 280 as residues: Trp-20 to Cys-26.

The tissue distribution in cardiomyopathy and homology to viral 12 kD nucleic
acid binding protein indicate that polynucleotides and polypeptides corresponding to
this gene are useful for diagnosis and intervention of cardiomyopathy, including those
caused by ischemic, hypertensive, congenital, valvular, or pericardial abnormalities.
The gene expression pattern may be a consequence or a cause of these conditions.

FEATURES OF PROTEIN ENCODED BY GENE NO: 48 

The translation product of this gene shares sequence homology with a tumor
necrosis factor-related gene product, which is thought to be important in tumor necrosis,
bacterial and viral infection, immune diseases and immunoreactions.

This gene is expressed primarily in colon and to a lesser extent in ovarian and 
breast cancers. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, tumors of colon, ovary or breast origins. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the colon, ovary and breast, expression of 
this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution and homology to tumor necrosis factors indicate that
polynucleotides and polypeptides corresponding to this gene are useful for intervention
of cancers of colon, ovary and breast origins, because TNF family members are known
to be involved in tumor development.

FEATURES OF PROTEIN ENCODED BY GENE NO: 49 

The translation product of this gene shares sequence homology with mucins,
such as epithelial mucin, which is thought to be important in extracellular matrix
functions such as protection, lubrication and cell adhesion (See for example Accession 
No. R68002). Preferred polypeptide fragments comprise the following amino acid 
sequence: PRSRPALRPGRQRPPSHSATSGVLRPRKKPDP (SEQ ID NO:500).
Also preferred are polynucleotide fragments encoding these polypeptide fragments.
Moreover, this gene maps to chromosome 22q11.2-qter, and therefore, can be used as
a marker in linkage analysis for chromosome 22. 

This gene is expressed primarily in corpus callosum.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, tumors, especially of corpus callosum, as well as metastatic lesions.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
corpus callosum and other solid tissues, expression of this gene at significantly higher
or lower levels may be routinely detected in certain tissues (e.g., cancerous and 
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal 
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder.

The tissue distribution and homology to mucins indicate that polynucleotides
and polypeptides corresponding to this gene are useful as serum tumor markers or
immunotherapy targets because tumor cells have greatly elevated levels of mucin
expression and shed the molecules into the epithelial tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 50 

This gene is expressed primarily in CD34 depleted buffy coat cord blood and 
primary dendritic cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, hematopoietic disorders and immunological disorders. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the hematopoietic and
immune systems, expression of this gene at significantly higher or lower levels may be 
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. 

The tissue distribution in CD34 depleted buffy coat cord blood and primary 
dendritic cells indicates that polynucleotides and polypeptides corresponding to this 
gene are useful for diagnosis and treatment of hematopoietic and immune disorders. 
Secreted or cell surface proteins with the above tissue distribution are often involved in
cell activation (e.g., cytokines) or are molecules involved in cell surface activation.

FEATURES OF PROTEIN ENCODED BY GENE NO: 51 

The translation product of this gene shares sequence homology with the
interferon-induced 1-8 gene-encoded polypeptide, which is thought to be important in
binding to the retroviral rev responsive element. Preferred polypeptide fragments comprise
the following amino acid sequence: MTLITPSXKLTFXKGNKSWSSRACSSTLVDP
(SEQ ID NO:501). Also preferred are polynucleotide fragments encoding these
polypeptide fragments.

This gene is expressed primarily in CD34 positive cells and neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, retroviral infection, such as AIDS, and other immune disorders. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the
immune system, expression of this gene at significantly higher or lower levels may be 
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a 
sequence shown in SEQ ID NO: 284 as residues: Gln-51 to Trp-62. 

The tissue distribution and homology to interferon-induced gene 1-8 indicate
that polynucleotides and polypeptides corresponding to this gene are useful for
intervention of retroviral infection including HIV. The factor may be involved in viral
stability or viral entry into the cells. Alternatively, the virus/factor complex may elicit 
the cellular immune reaction. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 52 

This gene shares sequence homology to immunoglobulin lambda chain (See
Accession No. 2865484). Therefore it is likely that this gene has activity similar to an
immunoglobulin lambda chain. Preferred polypeptide fragments comprise the following 
amino acid sequence: GHPSPALSIAPSDGSQLPCDEVPYGEAHVTRYCKKPLTNS 
HLETEAQSSSL (SEQ ID NO:502). Also preferred are polynucleotide fragments
encoding these polypeptide fragments.

This gene is expressed primarily in Hodgkin's lymphoma.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, Hodgkin's lymphoma and other immune disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the immune system, 
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 285 as residues: Pro-27 to Thr-32. 

The tissue distribution in Hodgkin's lymphoma and the sequence homology
indicate that polynucleotides and polypeptides corresponding to this gene are useful for
diagnosis of Hodgkin's lymphoma, since the elevated expression and secretion by the
tumor mass may be indicative of tumors of this type. Additionally, the gene product may
be used as a target in the immunotherapy of the cancer. Because the gene is expressed in
cells of lymphoid origin, the natural gene product may be involved in immune
functions. Therefore, it may also be used as an agent for immunological disorders
including arthritis, asthma, immune deficiency diseases such as AIDS, and leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 53 

This gene has extensive homology to cDNA for Homo sapiens mRNA for the
ISLR gene (See Accession No. AB003184). This protein is considered to be a new
member of the Ig superfamily and contains a leucine-rich repeat (LRR) with conserved 
flanking sequences and a C2-type immunoglobulin (Ig)-like domain. These domains are 
important for protein-protein interaction or cell adhesion, and therefore it is possible that 
the novel protein ISLR may also interact with other proteins or cells. The ISLR gene
was mapped on human chromosome 15q23-q24 by fluorescence in situ hybridization
(See Medline Article No. 97468140). Homology to the ISLR gene has been confirmed
by another independent group as well (See Accession No. Hs.102171).

This gene is expressed in a number of tissues including human retina, heart,
skeletal muscle, prostate, ovary, small intestine, thyroid, adrenal cortex, testis,
stomach, spinal cord, fetal lung and fetal kidney tissues, colon, tonsil and stomach
cancer, and to a lesser extent in endometrial stromal cells treated with estradiol, breast
tissue, synovium, lymphoma, and a number of other tumors.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, tumors of colon, ovary and breast origins. However, due to the wide
range of expression in various tissues, the protein may play a vital role in the development
of cancer in other tissues as well, not just those mentioned above. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the colon, ovary and
breast, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Additionally, this gene maps to chromosome
15q23-q24, and therefore, can be used as a marker in linkage analysis for chromosome 15.

The tissue distribution in tumors of colon, ovary, and breast origins indicates 
that polynucleotides and polypeptides corresponding to this gene are useful for 
diagnosis and intervention of these tumors, in addition to other tumors where 
expression has been indicated. The protein, as well as antibodies directed against the
protein, may show utility as tumor markers and/or immunotherapy targets for the above
listed tumors and tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 54 

This gene has homology to multidrug resistance gene 1 (See Accession No.
P06795). Preferred polynucleotide fragments comprise the following sequence:
GCTTCGTGTCCAACCCTCTTGCCCTTCGCCTGTGTGCCTGGAGCCAGTCCCA 
CCACGCTCGCGTTTCCTCCTGTAGTGCTCACAGGTCCCAGCACCGATGGCA 
TTCCCTTTGCCCTGAGTCTGCAGCGGGTCCCTTTTGTGCTTCCTTCCCCTCA
GGTAGCCTCTCTCCCCCTGGGCCACTCCCGGGGGTGAGGGGGTTACCCCTT

ccc agtg rnrn attcctgtggggctc acccc a aagt att aa aagt agcttt 

GTAA (SEQ ID NO:503). Also preferred are polypeptide fragments encoded by these 
polynucleotide fragments. 

This gene is expressed primarily in lung, esophagus, leukemia (Jurkat cells) and
breast cancers and to a lesser extent in macrophages treated with GM-CSF, fetal tissues,
and a wide range of tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancers of a wide range of origins. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or cell type(s). For a number of disorders of 
the above tissues or cells, particularly of the solid tumors, lung and leukemia, 
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Furthermore, due to the high expression level in lung tissue and the proposed
function of the multidrug resistance protein 1 gene as the efflux pump responsible for
low-drug accumulation in multidrug-resistant cells, the protein, as well as mutants thereof,
may also be beneficial as a target for gene therapy, particularly for the chronic patient.
Preferred epitopes include those comprising a sequence shown in SEQ ID NO: 287 as
residues: Met-1 to Lys-16.

The tissue distribution in a wide range of cancers and fetal tissues indicates that
polynucleotides and polypeptides corresponding to this gene are useful for detection of
cells in active proliferation, such as cancers. The gene products may be used as cancer
markers or immunotherapy targets.

FEATURES OF PROTEIN ENCODED BY GENE NO: 55

This gene maps to the X chromosome.

This gene is expressed primarily in the brain and to a lesser extent in the
developing embryo.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurodegenerative disease states and developmental disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders, including sex-linked disorders, of the above tissues or cells,
particularly of the neurological, developmental systems, and cardiovascular system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Moreover, this gene maps to the X chromosome, and therefore, may be used
as a marker in linkage analysis for this chromosome.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the detection/treatment of neurodegenerative
disease states and behavioral disorders such as Alzheimer's Disease, Parkinson's
Disease, Huntington's Disease, Klinefelter's, schizophrenia, mania, dementia,
paranoia, obsessive compulsive disorder and panic disorder. In addition, the gene or
gene product may also play a role in the treatment and/or detection of developmental
disorders associated with the developing embryo, sexually-linked disorders, or
disorders of the cardiovascular system.

FEATURES OF PROTEIN ENCODED BY GENE NO: 56 

The translation product of this gene shares sequence homology with paxillin,
which is thought to be important in mediating signal transduction from growth factor
receptors to the cytoskeleton. Preferred polynucleotide fragments comprise the 
following sequence: TGGCTCACTGTCTTACAATCACTGCTGTGGAATCATGA 
TACCACT1TTAGCTCITTGCATC
AAGTAGATITTAACTGGACAACTTTGAGTACT
GGCTTGTGGTTTCAA (SEQ ID NO:506). Also preferred are polypeptide fragments
encoded by these polynucleotide fragments. More preferably, polypeptide fragments 
comprise the amino acid sequence: LDELMAHLTEMQAKVAVRAD 
AGKKHLPDKQDHKASLDSMLGGLEQELQDLGIATVPKGHCASCQKPIAGKVI
HALGQSWHPEHFVCTHCKEEIGSSPFFERSGLXYCPNDYHQLFSPRCAYCAAP
ILDKVLTAMNQTWHPEHFFCSHCGEVFGAEGFHEKDKKPYCRKDFLAMFSPK 
CGGCNRPVLENYLSAMDTVWHPECFVCGDCFTSFSTGSFFELDGRPFCELHYH 
HRRGTLCHGCGQPITGRCISAMGYKFHPEHFVCAFCLTQLSKGIFREQNDKTY 
CQPCFNKLF (SEQ ID NO:507); KASLDSMLGGLEQELQDLGIATVPKGHC
ASCQKPIAGKVIHAL (SEQ ID NO:508); CPNDYHQLFSPRCAYCAAPILDKVL
TAMNQTWHPEHFFCSHCGEVFGAEG (SEQ ID NO:509); DKKPYCRKDFLAM 
FSPKCGGCNRPVLENYLSAMDTVWHPECFVCGDCFTSFSTGSFFELDGRPFCE 
L (SEQ ID NO:510); CGQPITGRCISAMGYKFHPEHFVCAFCLTQLSKGIFRE 
QNDKTYCQ (SEQ ID NO:511). Polynucleotide fragments encoding these preferred
polypeptide fragments are also contemplated.

This gene is expressed primarily in brain, and to a lesser extent in the 
developing embryo. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurological disease states and developmental abnormalities. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the immune and
nervous systems, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Moreover, since this gene shares homology with a
gene that maps to chromosome 11 (See Accession No. T87404), the gene as well as its
translated product may be used for linkage analysis on chromosome 11.

The tissue distribution and homology to paxillin indicate that polynucleotides
and polypeptides corresponding to this gene are useful for the treatment and/or detection
of disease states associated with abnormal signal transduction in brain and/or the
developing embryo. This would include treatment or detection of neurodegenerative
disease states and behavioral disorders such as Alzheimer's Disease, Parkinson's
Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia, obsessive
compulsive disorder and panic disorder and also in the treatment and/or detection of
embryonic development defects.

FEATURES OF PROTEIN ENCODED BY GENE NO: 57

This gene is expressed primarily in fetal spleen, brain, and to a lesser extent in 
six week old embryo. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders, neurological disorders, and developmental 
abnormalities. Similarly, polypeptides and antibodies directed to these polypeptides are 
useful in providing immunological probes for differential identification of the tissue(s) 
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the immune and developmental systems, expression of this gene at significantly higher
or lower levels may be routinely detected in certain tissues (e.g., cancerous and 
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal 
fluid) or another tissue or cell sample taken from an individual having such a disorder, 
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder. Preferred epitopes include
those comprising a sequence shown in SEQ ID NO: 290 as residues: Arg-28 to Gly-34. 

The expression of this gene in fetal spleen indicates that polynucleotides and 
polypeptides corresponding to this gene are useful for treatment/detection of immune 
disorders such as arthritis, asthma, immune deficiency diseases such as AIDS, and
leukemia. In addition, the expression of this gene in the early embryo indicates a key
role in embryo development and hence the gene or gene product could be used in the
treatment and/or detection of embryonic development defects. This would include
treatment or detection of neurodegenerative disease states and behavioral disorders such
as Alzheimer's Disease, Parkinson's Disease, Huntington's Disease, schizophrenia,
mania, dementia, paranoia, obsessive compulsive disorder and panic disorder and also
in the treatment and/or detection of embryonic development defects.

FEATURES OF PROTEIN ENCODED BY GENE NO: 58

The translation product of this gene shares sequence homology with the gene disrupted
in the neurodegenerative disease dentatorubral-pallidoluysian atrophy. Moreover, a long
open reading frame exists in an alternative frame. Preferred polypeptide fragments
comprise the following:

MGSSQSVEIPGGGTEGYHVLRVQENSPGHRAGLEPFFDFIVSINGSRLNKDND 
TLKDLLKXNVEKPVKMLIYSSKTLELRETSVTPSNLWGGQGLLGVSIRFCSFD 
GANENVWHVLEVESNSPAALAGLRPHSDYIIGADTVMNESEDLFSLIETHEAKP 
LKLYVYNTDTDNCREVIITPNSAWGGEGSLGCGIGYGYLHRIPTRPFEEGKKIS
LPGQMAGTPITPLKDGFTEVQLSSVNPPSLSPPGTTGIEQSLTGLSISSTPPAVSS
VLSTGVPTVPLLPPQVNQSLTSVPPMNPATTLPGLMPLPAGLPNLPNLNLNLPA 
PHIMPGVGLPELVNPGLPPLPSMPPRNLPGIAPLPLPSEFLPSFPLVPESSSAASS 
GELLSSLPPTSNAPSDPATTTAKADAASSLTVDVTPPTAKAPTTVEDRVGDSTPV 
SEKPVSAAVDANASESP (SEQ ID NO:512); SVEIPGGGTEGYHVLRVQENSPGH
RAGLEPFFDFIVSINGSRLNKDNDTLKDLLKXNVEKPVKMLIYSSKTLELRETS
VTPSNLWGGQGLLGVSIRFCSFDGANENVWH (SEQ ID NO:513); ESNSPAA 
LAGLRPHSDYIIGADTVMNESEDLFSLIETHEAKPLKLYVYNTDTDNCREVHTP 
NSAWGGEGSLGCGIGYGYLHRIPTRPFEEGKKISLPGQMAGTPITPLKDGFTEV 
QLSSVNPPSLSPPGTTGIEQSLTG LSISS (SEQ ID NO:514); RIPTRPFEEGKK1 

25 SLPGQMAGTPITPLKDGFTEVQLSSVNPPSLSPPGTTGIEQSLTGLSISSTPPAVS 
SVLSTGVPTVPLLPPQVNQSLTSVPPMNPATTLPGLMPLPAGLPNLPNLNLNLP 
APHIM PG VGLPEL VNPGLPPLPS M PPRN (SEQ ID NO:516); PGLPPLPSMPPRN 
LPGIAPLPLPSEFLPSFPLVPESSSAASSGELLSSLPPTSNAPSDPATTTAKADAA 
SSLTVDVTPPTAKAPTTVEDRVGDSTPVSEKPVSAAVDAN (SEQ ID NO:517). 

This gene is expressed primarily in prostate cancer, and to a lesser extent in the
pineal glands and in fetal lung.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurological conditions and pulmonary disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the nervous,
pulmonary, and endocrine systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 291 as residues: Asn-9 to Leu-14.

The abundance of this gene in the pineal gland and its homology to a gene
disrupted in the neurodegenerative disease state dentatorubral-pallidoluysian atrophy
indicate that this gene may be useful in the treatment and/or detection of other
neurodegenerative disease states and behavioral disorders such as Alzheimer's Disease,
Parkinson's Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia,
obsessive compulsive disorder and panic disorder. The abundance of this gene in fetal
lung would suggest that misregulation of the expression of this protein product in the
adult could lead to lymphoma or sarcoma formation, particularly in the lung; that it may
also be involved in predisposition to certain pulmonary defects such as pulmonary
edema and embolism, bronchitis and cystic fibrosis; and thus that the gene or the
protein encoded by the gene could be used in the detection and/or treatment of these
pulmonary disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 59 

This gene is expressed primarily in the developing embryo. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, developmental abnormalities. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the developmental system, expression of this
gene at significantly higher or lower levels may be routinely detected in certain tissues
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine,
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual
having such a disorder, relative to the standard gene expression level, i.e., the
expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The expression of this gene primarily in the embryo indicates that the gene plays a
key role in embryo development and that the gene or the protein encoded by the gene
could be used in the treatment and/or detection of developmental defects in the embryo
or in infants.

FEATURES OF PROTEIN ENCODED BY GENE NO: 60 

This gene displays homology to nestin, an intermediate filament protein, the
expression of which correlates with the proliferation of Central Nervous System
progenitor cells and which is useful in the identification of brain tumors. This gene maps
to chromosome 1, and therefore, may be used as a marker in linkage analysis for
chromosome 1 (See Accession No. AA527348).

This gene is expressed primarily in kidney and to a lesser extent in brain.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, renal disorders and neurodegenerative conditions. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the excretory and
nervous systems, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 293 as residues: Thr-128 to Asn-135.
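
Epitope ranges of the form "Thr-128 to Asn-135" name the first and last residue of a fragment by three-letter code and 1-based position in the referenced sequence. The following is a minimal sketch, under stated assumptions, of how such a range could be resolved against a polypeptide written in one-letter code. The function name `extract_residue_range` and the toy sequence in the usage note are hypothetical illustrations and are not sequences or methods of the invention.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Matches ranges written like "Asn-9 to Leu-14".
RANGE_PATTERN = re.compile(
    r"([A-Za-z]{3})-\s*(\d+)\s+to\s*([A-Za-z]{3})-\s*(\d+)"
)


def extract_residue_range(sequence, residue_range):
    """Return the fragment named by a range such as 'Asn-9 to Leu-14'.

    Positions are 1-based and inclusive. A ValueError is raised when the
    range is malformed, lies outside the sequence, or names residues that
    do not match the sequence at the stated positions.
    """
    match = RANGE_PATTERN.fullmatch(residue_range.strip())
    if match is None:
        raise ValueError(f"unrecognized residue range: {residue_range!r}")
    first_name, first_pos, last_name, last_pos = match.groups()
    start, end = int(first_pos), int(last_pos)
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("residue range lies outside the sequence")
    # Check that the named residues agree with the sequence itself.
    for name, pos in ((first_name, start), (last_name, end)):
        expected = THREE_TO_ONE.get(name.capitalize())
        if expected is None:
            raise ValueError(f"unknown residue code: {name!r}")
        if sequence[pos - 1] != expected:
            raise ValueError(f"position {pos} is not {name}")
    return sequence[start - 1:end]
```

For the hypothetical toy sequence `"MSTNPKLAG"`, the call `extract_residue_range("MSTNPKLAG", "Asn-4 to Leu-7")` returns `"NPKL"`. Checking the named residues against the sequence guards against numbering errors, for example a range counted from a different start position.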

The tissue distribution and homology to nestin indicate that polynucleotides
and polypeptides corresponding to this gene are useful for detection and/or treatment of
neurodegenerative disease states and behavioral disorders such as Alzheimer's Disease,
Parkinson's Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia,
obsessive compulsive disorder and panic disorder. In addition, its abundance in kidney
indicates that it is useful in the treatment and detection of acute renal failure and other
disease states associated with the kidney.

FEATURES OF PROTEIN ENCODED BY GENE NO: 61 

Gene shares homology with the latrophilin-related protein 1 precursor as well as
the calcium-independent alpha-latrotoxin receptor. Preferred polypeptide fragments
comprise the following amino acid sequence:

IYKVFRHTAGLKPEVSCFENIRSCARXXXXXXXXXXXXWIFGVLHVVHASVV
TAYLFTVSNAPQGMFIFLFLCVLSRKIQEEYYRLFKNVPCC (SEQ ID NO:518);
WIFGVLHVVHASVVTAYLFTVSNAFQGMFIFLFLCVLSRKIQEEYYRLFKNVPC
C (SEQ ID NO:519). Also preferred are polynucleotide fragments encoding these
polypeptide fragments. (See Accession No. 2213659.) The translation product of this
gene shares sequence homology with CD97, a seven transmembrane bound receptor.

This gene is expressed primarily in infant brain and in endothelial cells.
Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurological disorders and hematopoietic disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the neurological and
hematopoietic systems, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 294 as residues: Lys-13 to Leu-21.

The tissue distribution of this gene suggests that it may be useful in the detection
and/or treatment of neurodegenerative disease states and behavioral disorders such as
Alzheimer's Disease, Parkinson's Disease, Huntington's Disease, schizophrenia, mania,
dementia, paranoia, obsessive compulsive disorder and panic disorder, while its
expression in hematopoietic cell types indicates that the gene could be important for the
treatment or detection of immune or hematopoietic disorders including arthritis, asthma
and immunodeficiency diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 62 

This gene is expressed primarily in fetal liver and fetal spleen. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, hematological and immunological disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune and hematopoietic systems,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 295 as residues: Ser-91 to Lys-98.

The tissue distribution of this gene in fetal liver and spleen indicates that the gene
could be important for the treatment or detection of immune or hematopoietic disorders
including arthritis, leukemia, asthma and immunodeficiency diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 63 

Gene shares homology with human serum amyloid protein. Preferred polypeptide
fragments comprise the following amino acid sequence:

ALTRIPPGDWVINVTAVSFAGKTTARFFHSSPPSLGDQARTDPGHQRRD (SEQ
ID NO:520) (See Accession No. W13671). Also preferred are polynucleotide
fragments encoding these polypeptide fragments. This gene maps to chromosome 9, and
therefore, may be used as a marker in linkage analysis for chromosome 9 (See
Accession No. AA004342).

This gene is expressed primarily in fetal liver and spleen. 
Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, hematopoietic and immune disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the hematopoietic and immune systems,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.
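
The comparison described above, in which expression in a sample is judged higher or lower than the standard gene expression level, can be sketched as a simple fold-change test. This is only an illustrative sketch: the function name `classify_expression` and the default two-fold cut-off are assumptions made for the example, since the text does not state what difference counts as significant.

```python
def classify_expression(sample_level, standard_level, fold_threshold=2.0):
    """Compare a sample's expression level with the standard level.

    Both levels must be positive numbers in the same (arbitrary) units.
    fold_threshold is a hypothetical cut-off: a sample is called 'higher'
    when it is at least that many times the standard, 'lower' when it is
    at most 1/fold_threshold of the standard, and 'comparable' otherwise.
    """
    if sample_level <= 0 or standard_level <= 0:
        raise ValueError("expression levels must be positive")
    if fold_threshold <= 1:
        raise ValueError("fold_threshold must be greater than 1")
    ratio = sample_level / standard_level
    if ratio >= fold_threshold:
        return "higher"
    if ratio <= 1 / fold_threshold:
        return "lower"
    return "comparable"
```

With the assumed two-fold cut-off, `classify_expression(8.0, 2.0)` returns `"higher"` and `classify_expression(1.0, 4.0)` returns `"lower"`. In practice the standard level would be measured in healthy tissue or bodily fluid of the same type as the sample, and the threshold would be set from replicate measurements rather than fixed in advance.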



WO 98/54963    PCT/US98/11422



The tissue distribution of this gene in fetal liver-spleen indicates that the gene 
could be important for the treatment or detection of immune or hematopoietic disorders 
including arthritis, leukemia, asthma, and immunodeficiency diseases. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 64

This gene maps to chromosome 3, and therefore, may be used as a marker in
linkage analysis for chromosome 3 (See Accession No. AA219669).

This gene is expressed specifically in the brain.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurodegenerative disease states. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the neurological systems, expression of this
gene at significantly higher or lower levels may be routinely detected in certain tissues
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine,
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual
having such a disorder, relative to the standard gene expression level, i.e., the
expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the detection/treatment of neurodegenerative
disease states and behavioral disorders such as Alzheimer's Disease, Parkinson's
Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia, obsessive
compulsive disorder and panic disorder.

FEATURES OF PROTEIN ENCODED BY GENE NO: 65

Gene shares homology with a yeast protein. Preferred polypeptide fragments
comprise the following amino acid sequence: LQEVNITLPENSVWYERYKFDIP
VFHL (SEQ ID NO:521). Also preferred are polynucleotide fragments encoding these
polypeptide fragments. (See Accession No. 1332638.)

This gene is expressed primarily in fetal tissue (fetus and fetal liver).

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, liver disorders and cancers (e.g. hepatoblastoma). Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the hepatic system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 298 as residues: Asn-59 to Glu-64.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the detection and treatment of liver disorders
and cancers (e.g. hepatoblastoma, jaundice, hepatitis, liver metabolic diseases and
conditions that are attributable to the differentiation of hepatocyte progenitor cells). In
addition, the expression in fetus would suggest a useful role for the protein product in
developmental abnormalities, fetal deficiencies, pre-natal disorders and various
wound-healing models and/or tissue trauma.

FEATURES OF PROTEIN ENCODED BY GENE NO: 66 

Gene has homology with a B-cell surface antigen, which may indicate that the gene
plays a role in the immune response, including, but not limited to, disorders and
infections of the immune system. Preferred polynucleotide fragments comprise the
following sequence: TAGCATGTAGCCAGTCGAATAACNTATAAGGACAAAGTGGAGTC
CACGCGTGCGGCCGTCTAGACTAGTGGATCCCCCGGCTGCAGGATTCGGC
ACGAG (SEQ ID NO:523). Also preferred are polypeptide fragments encoded by
these polynucleotide fragments (See Accession No. T94535). Additionally, this gene
shares homology with an interferon-gamma receptor. Preferred polypeptide fragments
also comprise the following amino acid sequence: MQGSGSQFRACLLCLCFSCPC
SPGGPRWNSRQGGRRFPKTCRAISQNLVFKYKTFCPVRYMQPHRSSLCLHFTS
YVFILSTWGSLRTYSTDLKKKKKNSRGGPVPIRPKS (SEQ ID NO:522);
MQGSGSQFRACLLCLCFSCPCSPGGPRWNSRQGGRRFPKTCRAISQNLVFK
(SEQ ID NO:524); PVRYMQPHRSSLCLHFTSYVFDLSTWGSLRTYSTDLKKKKK
NSRGGPVPIRPKS (SEQ ID NO:525); and GEEQRDCSLGWRGVGMRATHCQAA
RMFVLFSLPKYAGL (SEQ ID NO:526). Also preferred are polynucleotide fragments
encoding these polypeptide fragments.

This gene is expressed primarily in T-cells and gall bladder.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immunological disorders and conditions (immunodeficiencies, cancer,
leukemia, hematopoiesis). Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune and digestive systems, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 299 as residues:
Thr-41 to Gly-52.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the treatment and diagnosis of immune
disorders including: leukemias, lymphomas, auto-immune disorders,
immuno-suppressive conditions (transplantation), immunodeficiencies (e.g. AIDS),
inflammation and hematopoietic disorders. The expression of this gene in gall bladder
would suggest a possible role for this gene product in digestive disorders, particularly
of the pancreas.

FEATURES OF PROTEIN ENCODED BY GENE NO: 67 

This gene maps to chromosome 11, and therefore, may be used as a marker in
linkage analysis for chromosome 11 (See Accession No. AA011622).

This gene is expressed primarily in a variety of fetal and developmental tissues 
(e.g. fetal spleen, infant brain). 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, developmental, immune or neurological abnormalities. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the developing
immune and central nervous systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 300 as residues: Ser-38 to Ser-43.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for developmental abnormalities or fetal
deficiencies. The detection in infant brain would suggest a role in neurological disorders
(both developmental and neurodegenerative conditions of the brain and nervous system,
behavioral disorders, depression, schizophrenia, Alzheimer's disease, Parkinson's
disease, Huntington's disease, mania, dementia). In addition, the detection in spleen
would similarly suggest a role in detection and treatment of immunologically mediated
disorders (e.g. immunodeficiency, inflammation, cancer, wound healing, tissue repair,
hematopoiesis).

FEATURES OF PROTEIN ENCODED BY GENE NO: 68

This gene is expressed primarily in spleen, T-cells, and fetal heart.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immunological deficiencies, including AIDS, and cardiovascular
disorders. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the immune and cardiovascular systems, expression of this gene at significantly higher
or lower levels may be routinely detected in certain tissues (e.g., cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis and treatment of immune
disorders including: leukemias, lymphomas, autoimmune disorders,
immunodeficiencies (e.g. AIDS), immuno-suppressive conditions (transplantation) and
hematopoietic disorders. The expression in fetal heart indicates that polynucleotides and
polypeptides corresponding to this gene are useful for the treatment and diagnosis of
cardiovascular disorders (e.g. heart disease, restenosis, atherosclerosis, stroke, angina,
thrombosis).

FEATURES OF PROTEIN ENCODED BY GENE NO: 69

Gene shares homology with a human collagen protein. Preferred polypeptide
fragments comprise the following amino acid sequence:
MPRKTSKCRQLLCSGASRNADTAARQSTCSSHRPPGKIPSLGPRRXPGCXSVP
SSRGEQSTGSPAAPRCGRRDAHRGLPGGAAMTPGDTWASFNPRAGHSKSQGE
GQESSGASRQDRHPVSHWVERQREAWGAPRSSSAGGVKVAATTEREPEFKIK
TGKA (SEQ ID NO:527); CSGASRNADTAARQSTCSSHRPPGKIPSLGPRRXPG
CXSVPSSRGEQSTGSPAAPRCGRRDAHRGLPGGAAMTPGDTWASFNPRAGHS
(SEQ ID NO:528); QGEGQESSGASRQDRHPVSHWVERQREAWGAPRSSSAGG
VKVAATTEREPEFKIKTGKA (SEQ ID NO:529) (See Accession No. 124886). Also
preferred are polynucleotide fragments encoding these polypeptide fragments.

This gene is expressed primarily in fetal heart. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cardiovascular disorders. Similarly, polypeptides and antibodies directed
to these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the cardiovascular system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 302 as residues:
Pro-32 to Ser-39.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the treatment and diagnosis of cardiovascular
disorders (e.g. heart disease, restenosis, atherosclerosis, stroke, angina, thrombosis).

FEATURES OF PROTEIN ENCODED BY GENE NO: 70 

The translation product of this gene shares sequence homology with a chicken
single-strand DNA-binding protein. Preferred polypeptide fragments comprise the
following amino acid sequence:

MSPRYPGGPRPPLRIPNQALGGVPGSQPLLPSGMDPTRQQGHPNMGGPMQRM
TPPRGMVPLGPQNYGGAMRPPLNALGGPGMPGMNMGPGGGRPWPNPTNAN
SIPYSSASPGNYVGPPGGGGPPGTPIMPSPADSTNSGDNMYTLMNAVPPGPNR
PNFPMGPGSDGPMGGLGGMESHHMNGSLGSGDMDSISKNSPNNMSLSNQP
GTPRDDGEMGGNFLNPFQSESYSPSMTMSV (SEQ ID NO:530); MSPRYPGG
PRPPLRIPNQALGGVPGSQPLLPSGMDPTRQQGHPNMGGPMQRMTPPRGMVP
LGPQNYGGAMRPPLNALGGPGMPGMNMGPGGGRPWPNPTNANSIPYSSASP
GNY (SEQ ID NO:531); LNALGGPGMPGMNMGPGGGRPWPNPTNANSIPYSS
ASPGNYVGPPGGGGPPGTPIMPSPADSTNSGDNMYTLMNAVPPGPN (SEQ ID
NO:532); GPMGGLGGMESHHMNGSLGSGDMDSISKNSPNNMSLSNQPGTPR
DDGEMGGNFLNPFQSESYSPSMTMSV (SEQ ID NO:533); TCEHSSEAKAFHDY
(SEQ ID NO:534). Also preferred are polynucleotide fragments encoding these
polypeptide fragments. (See Accession No. 1562534.)

This gene is expressed primarily in placenta and to a lesser extent in the fetal 
heart and a variety of other tissues and cell types. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, developmental abnormalities and fetal deficiencies, particularly of the
cardiovascular system. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the reproductive system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the detection and treatment of developmental
abnormalities or fetal deficiencies, ovarian and other endometrial cancers, reproductive
dysfunction, cardiovascular disorders, and pre-natal disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 71 

This gene is expressed primarily in fetal liver and to a lesser extent in the breast 
and testes. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, liver disorders (including hepatoblastomas) and reproductive disorders.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
hepatic and reproductive systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for detection and treatment of liver disorders and
cancers (e.g. hepatoblastoma, jaundice, hepatitis, liver metabolic diseases and
conditions that are attributable to the differentiation of hepatocyte progenitor cells). The
expression in testes and breast indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the detection and treatment of endocrine and
reproductive disorders (e.g. sperm maturation, milk production, testicular and breast
cancers).

FEATURES OF PROTEIN ENCODED BY GENE NO: 72

This gene maps to chromosome 1, and therefore, may be used as a marker in 
linkage analysis for chromosome 1 (See Accession No. W93595). 

This gene is expressed primarily in smooth muscle and to a lesser extent in
brain.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cardiovascular and neurological disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the cardiovascular and central nervous
systems, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the detection and treatment of restenosis,
atherosclerosis, stroke, angina, thrombosis, wound healing and other conditions of
heart disease. In addition, the expression in brain would suggest that polynucleotides
and polypeptides corresponding to this gene are useful for the detection and treatment of
developmental, degenerative and behavioral conditions of the brain and nervous system
(e.g. schizophrenia, depression, Alzheimer's disease, Parkinson's disease,
Huntington's disease, mania, dementia, paranoia, addictive behavior and sleep
disorders).



FEATURES OF PROTEIN ENCODED BY GENE NO: 73 

Gene shares homology with human stromalin-2. Preferred polypeptide 
fragments comprise the following amino acid sequence: 

QAFVLLSDLLLIFSPQMIVGGRDFLRPLVFFPEATLQSELASFLMDHVFIQPGDL
GSGA (SEQ ID NO:535); ACSYLLCNPEFTFFSRADFARSQLVDLLTDRFQQE
LEELLQVG (SEQ ID NO:536); QKQLSSLRDRMVAFCELCQSCLSDVDTEIQEQV
ST (SEQ ID NO:537); QVILPALTLVYPSILWTLTHISKSDAS (SEQ ID NO:538);
STHDLTRWELYEPCCQLLQKAVDTGXVPHQV (SEQ ID NO:539). Also preferred
are polynucleotide fragments encoding these polypeptide fragments (See Accession
No. R65208). This gene maps to chromosome 7, and therefore, may be used as a
marker in linkage analysis for chromosome 7 (See Accession No. D52585).

This gene is expressed primarily in the brain (infant brain, adult brain, pituitary, 
cerebellum, hippocampus, schizophrenic hypothalamus, amygdala).

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, developmental and neurodegenerative diseases of the brain and nervous
system. Similarly, polypeptides and antibodies directed to these polypeptides are useful
in providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
central nervous system, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 306 as residues: Thr-25 to Lys-36,
Lys-55 to Ser-63.
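
Epitope ranges of the form "Thr-25 to Lys-36" name the first and last residue of a stretch by three-letter code and position in the referenced sequence. The following is a minimal sketch, assuming positions are 1-based and inclusive, of how such ranges could be read off a one-letter polypeptide sequence. The function name `extract_epitopes` and the toy sequence are hypothetical and are not taken from the sequence listing; SEQ ID NO: 306 itself is not reproduced here.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Matches ranges written as "Thr-25 to Lys-36".
RANGE_RE = re.compile(r"([A-Z][a-z]{2})-(\d+) to ([A-Z][a-z]{2})-(\d+)")


def extract_epitopes(sequence, ranges_text):
    """Return the peptide for each 'Xxx-N to Yyy-M' range (assumed 1-based, inclusive)."""
    epitopes = []
    for first, start, last, end in RANGE_RE.findall(ranges_text):
        start, end = int(start), int(end)
        if not (1 <= start <= end <= len(sequence)):
            raise ValueError(f"range {start}-{end} is outside the sequence")
        peptide = sequence[start - 1:end]
        # Check that the named end residues agree with the sequence.
        expected = (THREE_TO_ONE.get(first), THREE_TO_ONE.get(last))
        if (peptide[0], peptide[-1]) != expected:
            raise ValueError(f"residues at {start}/{end} do not match {first}/{last}")
        epitopes.append(peptide)
    return epitopes


# Toy sequence for illustration only; it is not a sequence from this document.
toy_sequence = "MKTAYIAKQRSTLLK"
print(extract_epitopes(toy_sequence, "Thr-3 to Lys-8, Arg-10 to Leu-13"))
```

The end-residue check is a design choice: it flags a mismatch between the stated residue names and the sequence instead of silently returning a shifted peptide.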

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for detection and treatment of developmental,
degenerative and behavioral conditions of the brain and nervous system (e.g.
schizophrenia, depression, Alzheimer's disease, Parkinson's disease, Huntington's
disease, mania, dementia, paranoia, addictive behavior and sleep disorders).

FEATURES OF PROTEIN ENCODED BY GENE NO: 74 

This gene is expressed primarily in the hypothalamus of a human suffering from
schizophrenia.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, disorders of the CNS, particularly schizophrenia. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the CNS, such as schizophrenia,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 307 as residues: Gly-38 to Ala-44.

The tissue distribution indicates that the protein products of this gene are useful 
for the study, diagnosis and treatment of schizophrenia and other disorders involving 
the CNS. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 75

Preferred polypeptides of the invention comprise the following amino acid 
sequence encoded by this gene: 

LAVSTSFICCADISTALPLGSSRPAPAPRHREHEHGHQARPPRLLXTSLMPLSTP 
AAAQLLWTQLTPMGGRPGGRHSPPTLHTGPRALPPGPPHPSLHVAALSLLR 
(SEQ ID NO:540). Polynucleotides encoding such polypeptides are also provided.

This gene is expressed primarily in endometrial tumor and to a lesser extent in 
amniotic cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, reproductive and immune disorders, particularly cancers of those
systems. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the reproductive and immune systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 308 as residues: Ser-3 to Arg-9.

The tissue distribution indicates that the protein products of this gene are useful
for study and treatment of immune and reproductive disorders, particularly cancers of
those systems.

FEATURES OF PROTEIN ENCODED BY GENE NO: 76 

This gene is expressed primarily in kidney cortex and to a lesser extent in
early-stage human brain.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, renal disorders such as renal cancer. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the kidney, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 309 as residues:
Gly-38 to Gly-45, Gly-47 to Gly-52, Pro-92 to Lys-110.

The tissue distribution indicates that the protein products of this gene are useful 
for study, treatment and diagnosis of renal diseases such as cancer of the kidney.

FEATURES OF PROTEIN ENCODED BY GENE NO: 77

This gene is expressed primarily in kidney medulla.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, metabolic and renal disorders. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the metabolic and renal systems, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that the protein products of this gene are useful 
for study, treatment and diagnosis of metabolic and renal diseases and disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 78

This gene is expressed in chronic synovitis and microvascular endothelium.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, arthritis and atherosclerosis. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the vascular and skeletal systems, expression
of this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that the protein products of this gene are useful
for study, diagnosis and treatment of arthritic and other inflammatory diseases as well
as cardiovascular diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 79

This gene is expressed in resting T-cells and activated monocytes. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that the protein products of this gene are useful 
for the study and treatment of immune diseases such as inflammatory conditions. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 80 

This gene is expressed in a variety of immune system tissues, e.g., neutrophils,
T-cells, and TNF-induced epithelial and endothelial cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, infectious and immune disorders. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the immune and vascular systems, expression
of this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 313 as residues: Met-1 to Trp-6.

The tissue distribution indicates that the protein products of this gene are useful 
for study and treatment of infectious diseases, immune and vascular disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 81

This gene is expressed in activated neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammation and other immune conditions. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune system, expression of this gene
at significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that the protein products of this gene are useful 
for study and treatment of immune disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 82 

This gene is expressed in neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammatory and other immune conditions. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune system, expression of this gene
at significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 315 as residues:
Ala-83 to Thr-91.

The tissue distribution indicates that the protein products of this gene are useful
for study and treatment of immune disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 83

This gene is expressed in human neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammation and immune disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune and inflammatory system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that the protein products of this gene are useful 
for diagnosis and treatment of disorders of the inflammatory and immune systems. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 84

This gene is expressed in human neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, disorders of the inflammatory and immune systems. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the inflammatory and
immune systems, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that the protein products of this gene are useful
for diagnosis and treatment of disorders of the immune and inflammatory systems.

FEATURES OF PROTEIN ENCODED BY GENE NO: 85

This gene is expressed in activated neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammation and immune system diseases. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune system and inflammatory
system, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that the protein products of this gene are useful 
for diagnosis and treatment of diseases of the inflammatory and immune systems. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 86

This gene is expressed in activated neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammation and immune system disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the inflammatory and immune system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 319 as residues: Met-1 to Gly-6, Gly-32 to Pro-43, Leu-55 to Gln-60.

The tissue distribution indicates that the protein products of this gene are useful 
for diagnosis and treatment of disorders of the immune and inflammatory system.

FEATURES OF PROTEIN ENCODED BY GENE NO: 87

In specific embodiments, polypeptides of the invention comprise the sequence:
EQVLALLWPRFELILEMNVQSVRSTDPQRLGGLDTRPHYITRRYAEFSSALVSIN
QTIPNERTMQLLGQLQVEVENFVLRVAAEFSSRKEQLVFLINNYDMMLGVLMH
RAADDSKEVESFQQLLNARTQEFIEELLSPPFGGLVAFVKEAEALIERGQAERLR
GEEARVTQLIRGFGSSWKSSVESLSQDVMRSFTNFRNGTSIIQG (SEQ ID
NO:541); ALLKYRFFYQFLLGNERATAKEIRDEYVETLSKIYLSYYRSYLGRLMK
VQYEEVAEKDDLMGVEDTAKKGFXSKPSRSRNTIFTLGTRGSVISPTELEAPILV
PHTAQR (SEQ ID NO:542); EQRYPFEALFRSQHYXLLDNSCREYLFICEFFVVS
GPXAHDLFHAVMGRTLSMTLKHLDSYLADCYDAIAVFLCIHIVLRFRNIAAKRD
VPALDRYW (SEQ ID NO:543); GGLDTRPHYITRRYAEFSSALVSINQ (SEQ ID
NO:544); SRKEQLVFLINNYDMMLGVL (SEQ ID NO:545) and/or ALLKYRFFY
QFLLGNERATAKEIRDEYVETLSKIYLSYYRSYLGRLMKVQYEEVAEKDDLMG
VEDTAKKGFXSKPSLRSRNTDFTLGTRGSVISPTELEAPILVPHTAQRXEQRYPF
EALFRSQHYXLLDNSCREYLFICEFFVVSGPXAHDLFHAVMGRTLSMTLKHLD
SYLADCYDAIAVFLCIHIVLRFRNIAAKRDVPALDRYWEQVLALLWPRFELILEM
NVQSVRSTDPQRLGGLDTRPHYITRRYAEFSSALVSINQTIPNERTMQLLGQLQV
EVENIFVLRVAAEFSSRKEQLVFLINNYDMMLGVLMERAADDSKEVESFQQLLN
ARTQEFIEELLSPPFGGLVAFVKEAEALIERGQAERLRGEEARVTQLIRGFGSSW
KSSVESLSQDVMRSFTNFRNGTS (SEQ ID NO:546). Polynucleotides encoding
these polypeptides are also encompassed by the invention. The translation product of
this gene shares sequence homology with suppressor of actin mutation which is thought
to be important in mutation suppression.

This gene is expressed primarily in fetal liver and to a lesser extent in a variety
of other tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, liver and mutations. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the liver or cancer, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 320 as residues:
Val-53 to Arg-60, Thr-88 to Thr-94, Ala-142 to Ser-150, Gly-188 to Glu-196,
Gly-208 to Ser-214, Thr-227 to Gly-232, Lys-279 to Phe-285.

The tissue distribution and homology to suppressor of actin mutation suggest
that polynucleotides and polypeptides corresponding to this gene are useful for
diagnosis and treatment of liver disorders or cancer.

FEATURES OF PROTEIN ENCODED BY GENE NO: 88 

This gene maps to chromosome 9, and therefore can be used in linkage analysis
as a marker for chromosome 9. In specific embodiments, polypeptides of the invention
comprise the sequence:

YEGKEFDYVFSIDVNEGGPSYKLPYNTSDDPWLTAYNFLQKNDLNPMFLDQVA
KFIIDNTKGQMLGLGNPSFSDPFTGGGRYVPGSSGSSNTLPTADPFTGAGRYV
PGSASMGTTMAGVDPFTGNSAYRSAASKTMNIYFPKKEAVTFDQANPTQILGK
LK£LNGTAPEEKKLTEDDLILLEiaLSLICNSSSEKPTVQQLQILWKAINCPEDIV
FPALDILRLSIKHPSVNENFCNEKEGAQFSSHLINLLNPKGKPANQLLALRTFC
NCFVGQAGQKLMMSQRESLMSHAIELKSGSNKNI (SEQ ID NO:547);
HIALATLALNYSVCFHKD (SEQ ID NO:548); HNIEGKAQCLSLISTILEVVQ
DLEATFRLLVALGTLISDDSNAVQLAKS (SEQ ID NO:549); LGVDSQIKKYSS
VSEPAKVSECCRFILNLL (SEQ ID NO:550); and/or YEGKEFDYVFSIDVNEGGPS
YKLPYNTSDDPWLTAYNFLQKNDLNPMFLDQVAKFIIDNTKGQMLGLGNPSFS
DPFTGGGRYVPGSSGSSNTLPTADPFTGAGRYVPGSASMGTTMAGVDPFTGN
SAYRSAASKTMNIYTPKKEAVTFDQANFT
LLEKJLSLICNSSSEKPTVQQLQILWKAINCPEDIVFPALDILRLSIKHPSVNENFC
NEKEGAQFSSHLINLLNPKGKPANQLLALRTFCNCFVGQAGQKLMMSQRESL
MSHAIELKSGSNKNIHIALATLALNYSVCFHKDHNIEGKAQCLSLISTILEVVQD
LEATFRLLVALGTLISDDSNAVQLAKSLGVDSQIKKYSSVSEPAKVSECCRFILN
LL (SEQ ID NO:551). Polynucleotides encoding these polypeptides are also
encompassed by the invention. These polypeptides share significant homology with
phospholipase A2 activating protein which is thought to be important in signal
transduction (see, e.g., Wang et al., Gene 161(2):237-241 (1995)).

This gene is expressed primarily in endothelial cells, and to a lesser extent in placenta,
endometrial stromal cells, osteosarcoma, testis tumor, muscle, and infant brain that are
likely to be rich in blood vessels.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, disorders of the vascular system, aberrant angiogenesis, and tumor
angiogenesis. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of the
vascular system or tumors, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution of this gene in endothelial cells and several potentially
highly vascularized tissues and its homology to phospholipase A2 activating protein
suggest that this gene may be involved in transducing signals for endothelial cells in
angiogenesis or vasculogenesis.

FEATURES OF PROTEIN ENCODED BY GENE NO: 89 

In specific embodiments, polypeptides of the invention comprise the sequence:
YPNQDGDILRDQVLHEHIQRLSKVVTANHRALQIPEVYLREAPWPSAQSEIRTIS
AYKTPRDKVQCILRMCSTIMNLLSLANEDSVPGADDFVPV^
STVQYTSSFYASCLSGEESYWWMQFTAAVE (SEQ ID NO:552); YPNQDGDILR
DQVLHEHIQRLSKVWANHRALQIPEVYLREAPWPSAQSEIRTISAYKTPRDKVQ
CILRMCSTIMNLLSLANEDSVPGADDFVPVLVFVLIKANPPCLLSTVQYISSFYA
SCLSGEESYWWMQFTAAVEFIKTI (SEQ ID NO:553); YPNQDGDILRDQVL (SEQ
ID NO:554); EAPWPSAQSEI (SEQ ID NO:555); PVLVFVLIKANP (SEQ ID
NO:560); SGEESYWWMQFTAAVEFIKTI (SEQ ID NO:556); ADDFVPVLVF
VLIKANPP (SEQ ID NO:557); YKTPRDKVQCIL (SEQ ID NO:558); and/or
GADDFVPVLVFVLIK (SEQ ID NO:559). The translation product of this gene shares
sequence homology with human ras inhibitor and yeast VPS9p which is thought to be
important in Golgi vacuole transport.

This gene is expressed primarily in T cells and melanocytes and to a lesser 
extent in a variety of other tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, dysfunction and disorders involving T cells and melanocytes. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution and homology to ras inhibitor indicate that
polynucleotides and polypeptides corresponding to this gene are useful for regulating
signal transduction, and for diagnosis and treatment of disorders involving T cells and
melanocytes.

FEATURES OF PROTEIN ENCODED BY GENE NO: 90 

This gene maps to chromosome 9 and therefore polypeptides of the invention
can be used in linkage analysis as a marker for chromosome 9. The translation product
of this gene shares sequence homology with neuronal olfactomedin-related ER localized
protein which is thought to be important in influencing the maintenance, growth, or
differentiation of chemosensory cilia on the apical dendrites of olfactory neurons. In
specific embodiments, polypeptides of the invention comprise the sequence:

SARASTQPPAGQHPGPC (SEQ ID NO:561); MPGRWRWQRDMHPARKLLSLL
FLILMGTELTQD (SEQ ID NO:562); SAAPDSLLRSSKGSTRGSL (SEQ ID
NO:563); AAIVIWRGKSESRIAKTPGI (SEQ ID NO:564); FRGGGTLVLPPTHT
PEWLIL (SEQ ID NO:567); PLGITLPLGAPETGGGD (SEQ ID NO:565); and/or
CAAETWKGSQRAGQLCALLA (SEQ ID NO:566).

This gene is expressed in pineal gland.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 

30 not limited to, neurological and endocrinological disorders. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the neurological or endocrine systems, 
expression of this gene at significantly higher or lower levels may be routinely detected 

in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 323 as residues: Leu-20 to Ala-26, Arg-32 to Arg-39, and Thr-104 to Gly-112.

The tissue distribution and homology to olfactomedin-related protein indicate
that polynucleotides and polypeptides corresponding to this gene are useful for the
maintenance, growth, or differentiation of neuronal cells in the pineal gland and may therefore
be useful for diagnosis and treatment of neurological disorders in the pineal gland.

FEATURES OF PROTEIN ENCODED BY GENE NO: 91 

This gene is expressed primarily in prostate and apoptotic T cells.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, prostate disease and T cell dysfunction. Similarly, polypeptides and 

antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly prostate cancer, expression of this gene
at significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 

fluid or spinal fluid) or another tissue or cell sample taken from an individual having

such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 
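
The comparison described above, in which a sample's expression level is judged against the standard level from healthy tissue or bodily fluid, can be sketched as follows. This is a minimal illustration only: the function name `classify_expression` and the two-fold cutoff `fold_threshold` are assumptions introduced here, since the text does not state how "significantly higher or lower" is quantified.

```python
def classify_expression(sample_level, standard_level, fold_threshold=2.0):
    """Compare a sample's expression level with the standard level.

    fold_threshold is an assumed cutoff for "significantly" different;
    the specification does not give a numeric value.
    """
    if sample_level < 0 or standard_level <= 0:
        raise ValueError("sample must be non-negative and standard must be positive")
    ratio = sample_level / standard_level
    if ratio >= fold_threshold:
        return "higher"
    if ratio <= 1.0 / fold_threshold:
        return "lower"
    return "comparable"


if __name__ == "__main__":
    # Hypothetical measurements in arbitrary units.
    print(classify_expression(8.0, 2.0))   # ratio 4.0
    print(classify_expression(0.5, 2.0))   # ratio 0.25
    print(classify_expression(2.5, 2.0))   # ratio 1.25
```

In practice the cutoff would depend on the assay and on the variability of the standard; the fixed ratio here only stands in for that decision.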

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for detecting abnormal activity in prostate and T cells
and possibly for treating such abnormalities.

FEATURES OF PROTEIN ENCODED BY GENE NO: 92 

This gene is expressed primarily in prostate and to a lesser extent in smooth 
muscle cells, fibroblasts, and placenta. 

Therefore, polynucleotides and polypeptides of the invention are useful as

reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, disorders in prostate or vascular system. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 

for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the prostate or vascular system, expression
of this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for regulating the function of the prostate or of highly
vascularized tissues, e.g., placenta.

FEATURES OF PROTEIN ENCODED BY GENE NO: 93

This gene is expressed primarily in human embryos and fetal tissues and
to a lesser extent in a wide variety of other proliferative tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 

biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, disorders in embryonic development and cell proliferation. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the embryonic tissues 

and proliferative cells, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or 
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another 
tissue or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily 

fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis or treatment of abnormalities in 
developing and proliferative cells and organs. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 94

The translation product of this gene shares sequence homology with 
transformation related protein which is thought to be important in transformation. 

This gene is expressed primarily in female reproductive tissues, i.e., breast 
cancer cells, placenta, and ovary and to a lesser extent in fetal lung. 
Therefore, polynucleotides and polypeptides of the invention are useful as

reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancer or dysfunction of reproductive tissues. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the reproductive system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 

disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 327 as residues: Ser-50 to Pro-61. 
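
Residue ranges written in the form used above (for example "Ser-50 to Pro-61") can be turned into coordinates and used to cut the corresponding fragment from a polypeptide sequence. The sketch below assumes 1-based, inclusive residue numbering; the helper names `parse_epitope_ranges` and `extract_epitope` are hypothetical and are not part of the specification.

```python
import re

# Matches ranges such as "Ser-50 to Pro-61" (three-letter residue codes).
RANGE_PATTERN = re.compile(r"([A-Z][a-z]{2})-(\d+) to ([A-Z][a-z]{2})-(\d+)")


def parse_epitope_ranges(text):
    """Return (first_residue, start, last_residue, end) tuples found in text."""
    return [
        (first, int(start), last, int(end))
        for first, start, last, end in RANGE_PATTERN.findall(text)
    ]


def extract_epitope(sequence, start, end):
    """Return residues start..end of sequence (1-based, inclusive; assumed convention)."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("range falls outside the sequence")
    return sequence[start - 1:end]


if __name__ == "__main__":
    for first, start, last, end in parse_epitope_ranges("Ser-50 to Pro-61"):
        print(first, start, last, end)
```

The three-letter codes returned by the parser could additionally be checked against the first and last residues of the extracted fragment as a consistency test on the numbering.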

The tissue distribution and homology to transformation related protein indicate
that polynucleotides and polypeptides corresponding to this gene are useful for
diagnosis and treatment of conditions caused by transformation, i.e., tumorigenesis in
reproductive organs, e.g., breast, placenta, and ovary.

FEATURES OF PROTEIN ENCODED BY GENE NO: 95 

This gene is expressed primarily in testes, rhabdomyosarcoma, infant brain and 
to a lesser extent in some tumors and highly vascularized tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as

reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, tumorigenesis, abnormal angiogenesis, and/or neurological disorders.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the 
tumor tissues or vascular tissues, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 

another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Preferred epitopes include those 
comprising a sequence shown in SEQ ID NO: 328 as residues: Arg-46 to Trp-54,
Pro-60 to Ile-69, Asn-116 to Ala-122, Arg-147 to Lys-153, Ser-158 to Glu-170, Ile-399 to
Ser-405, Pro-486 to Met-499, and Pro-502 to Asp-508.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for a range of disease states including treatment of
tumor or vascular disorders and the treatment of neurological disorders such as
Alzheimer's Disease, Parkinson's Disease, Huntington's Disease, schizophrenia, mania,
dementia, paranoia, obsessive compulsive disorder and panic disorder. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 96

This gene maps to chromosome 7 and therefore polynucleotides of the present 
invention can be used in linkage analysis as a marker for chromosome 7. The 
translation product of this gene is homologous to the Clostridium perfringens 
enterotoxin (CPE) receptor gene product and shares sequence homology with a human 

ORF specific to prostate and a glycoprotein specific to oligodendrocytes, both of which
are tissue specific proteins. (See, e.g., Katahira et al., J Cell Biol. 136(6):1239-1247
(1997); PMID: 9087440; UI: 97242441.)

This gene is expressed primarily in pancreas tumor and ulcerative colitis and to a 
lesser extent in several tumors and normal tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as

reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, pancreatic disorder, ulcerative colitis, tumors and food poisoning. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 

providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the 
digestive system or tumorigenic system, expression of this gene at significantly higher 
or lower levels may be routinely detected in certain tissues (e.g., cancerous and 
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal 

fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue 
or bodily fluid from an individual not having the disorder. Preferred epitopes include 
those comprising a sequence shown in SEQ ID NO: 329 as residues: Gly-247 to
Met-152, Cys-177 to Lys-188.

The tissue distribution and homology to prostate and oligodendrocyte-specific
proteins indicate that polynucleotides and polypeptides corresponding to this gene are
useful as markers for the diagnosis or treatment of disorders of the pancreas, ulcerative colitis,
and tumors. Furthermore, identity to the human receptor for Clostridium perfringens
enterotoxin indicates that the soluble portion of this receptor could be used in the
treatment of food poisoning associated with Clostridium perfringens by blocking the
activity of perfringens enterotoxin.

FEATURES OF PROTEIN ENCODED BY GENE NO: 97

The translation product of this gene shares sequence homology with ATPase 
which is thought to be important in metabolism. 
This gene is expressed primarily in testes and several hematopoietic cells and to

a lesser extent in other tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, leukemia and hematopoietic disorders. Similarly, polypeptides and

antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the hematopoietic system, expression of this 
gene at significantly higher or lower levels may be routinely detected in certain tissues 
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine,
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual 
having such a disorder, relative to the standard gene expression level, i.e., the 
expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 330 as residues: Leu-37 to Ala-42.

The tissue distribution and homology to ATPase indicate that polynucleotides
and polypeptides corresponding to this gene are useful as markers for the diagnosis and
treatment of leukemia and other hematopoietic disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 98

In specific embodiments, polypeptides of the invention comprise the sequence: 
MRSARPSLGCLPSWAFSQALNI (SEQ ID NO:568); LLGLKGLAPAEISAVCE 
KGNFN (SEQ ID NO:569); VAHGLAWSYYIGYLRLILPELQARIR (SEQ ID 
NO:570); TYNQHYNNLLRGAVSQRC (SEQ ID NO:571); ILLPLDCGVPDNLSM
ADPNIRFLDKLPQQTGDRAGIKDRVYSN (SEQ ID NO:572); SIYELLENGQRAGT
CVLEYATPLQTLFAMSQYSQAGFSGEDRLEQ (SEQ ID NO:573); AKLFCRTLE 
DILADAPESQNNCRLIAYQEPADDSSFSLSQEVLRHLRQEEKEEVTVGSLKTSAV 
PSTSTMSQEPELLISGMEKPLPLRTDFS (SEQ ID NO:574); and/or LLGLKGLA 
PAEISAVCEKGNFNVAHGLAWSYYIGYLRLILPEL (SEQ ID NO:575). 

Polynucleotides encoding these polypeptides are also encompassed by the invention.

This gene is expressed primarily in prostate BPH and to a lesser extent in bone 
marrow.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, benign prostatic hypertrophy or prostate cancer. Similarly, polypeptides 
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of 
disorders of the above tissues or cells, particularly of the male urinary system, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 331 as residues: Ile-60 to Asn-69, Leu-106 to Asp-112, Glu-130 to Gly-136,
Phe-160 to Glu-167, Pro-184 to Cys-190, Glu-197 to Ser-202, Arg-215 to Glu-221, and
Thr-237 to Pro-242.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis or treatment of benign prostatic 
hypertrophy or prostate cancer. 


FEATURES OF PROTEIN ENCODED BY GENE NO: 99 

This gene is expressed primarily in salivary gland. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 

biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, disorders or injuries of the salivary gland. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of glandular tissues, expression of this gene at 

significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment of disorders of, or injuries to, the
salivary gland or other glandular tissue.






FEATURES OF PROTEIN ENCODED BY GENE NO: 100 

This gene maps to chromosome 15; accordingly, polynucleotides of the
invention can be used in linkage analysis as a marker for chromosome 15. The
translation product of this gene shares sequence homology with a C. elegans gene of
unknown function. In specific embodiments, polypeptides of the invention comprise
the sequence: DPRVRLNSLTCKHIFISLTQ (SEQ ID NO:583); TMKLLKLRRNIV
KLSLYRHFTN (SEQ ID NO:576); TLILAVAASIVFHWTTMKFRI (SEQ ID
NO:577); VTCQSDWRELWVDDAIWRLLFSMILFVI (SEQ ID NO:578); MVLWR
PSANNQRFAFSPLSEEEEEDEQ (SEQ ID NO:580); KEPMLKESFEGMKMRS
TKQEPNGNSKVNKAQEDDL (SEQ ID NO:584); and/or KWVEENVPSSVTDVALP
ALLDSDEERMITHFERSKME (SEQ ID NO:582). Polynucleotides encoding these
polypeptides are also encompassed by the invention.

This gene is expressed primarily in thyroid and to a lesser extent in 

osteoclastoma, kidney medulla, and lung.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, thyroid dysfunction or cancer. Similarly, polypeptides and antibodies 

directed to these polypeptides are useful in providing immunological probes for

differential identification of the tissue(s) or cell type(s). For a number of disorders of 
the above tissues or cells, particularly of the endocrine system, expression of this gene 
at significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 

fluid or spinal fluid) or another tissue or cell sample taken from an individual having

such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred 
epitopes include those comprising a sequence shown in SEQ ID NO: 333 as residues: 
Lys-107 to Leu-124, Glu-150 to Thr-159, Pro-173 to Asp-179, and Ser-192 to Ser-201.

The tissue distribution indicates that polynucleotides and polypeptides

corresponding to this gene are useful for diagnosis and treatment of thyroid dysfunction 
or cancer. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 101 

This gene maps to chromosome 16; therefore, polynucleotides of the invention
can be used in linkage analysis as a marker for chromosome 16. In specific
embodiments, polypeptides of the invention comprise the sequence:
IRHELTVLRDTRPACA (SEQ ID NO:585); and/or MDFXMALIYD (SEQ ID
NO:586). Polynucleotides encoding these polypeptides are also encompassed by the
invention.

This gene is expressed primarily in kidney cortex and to a lesser extent in adult 
brain, corpus callosum, hippocampus, and frontal cortex.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, neurological disorders. Similarly, polypeptides and antibodies directed to 
these polypeptides are useful in providing immunological probes for differential

identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the central nervous system, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment or diagnosis of neurological 
disorders.



FEATURES OF PROTEIN ENCODED BY GENE NO: 102 

In specific embodiments, polypeptides of the invention comprise the sequence: 
MQEMMRNQDRALSNLESIPGGYNA (SEQ ID NO:587); LRRMYTDIQEPMLSA
AQEQFGGNPF (SEQ ID NO:588); ASLVSNTSSGEGSQPSRTENRDPLPNPWAP
QT (SEQ ID NO:589); SQSSSASSGTASTVGGTTGSTASGTSGQSTTAPNLVPGV
GASMFNTPGMQSLLQQITENPQLMQNMLSAPY (SEQ ID NO:590);
MRSMMQSLSQNPDLAAQMMLNNPLFAGNPQLQEQMRQQLPTFLQQ (SEQ ID
NO:591); MQNPDTLSAMSNPRAMQALLQIQQGLQTLATEAPGLIPGFTPGLG
ALGSTGGSSGTNGSNATPSENTSPTAGT (SEQ ID NO:592); TEPGHQQFI
QQMLQALAGVNPQLQNPEVRFQQQLEQLSAM 

IERLLGSQPS (SEQ ID NO:593); RNPAMMQEMMRNQDRALSNLESIPGGY 
NALRRMYTDIQEPMLSAA (SEQ ID NO:594); GNPFASLVSNTSS (SEQ ID
NO:595); ENRDPLPNPWA (SEQ ID NO:595); GKILKDQDTLSQHGIHD (SEQ ID
NO:597); GLTVHLVIKTQNRP (SEQ ID NO:598); SELQSQMQRQLLSNPEMM
(SEQ ID NO:599); PEISHMLNNPDIMR (SEQ ID NO:600); and/or
RQLIMANPQMQQLIQRNP (SEQ ID NO:601). Polynucleotides encoding these
polypeptides are also encompassed by the invention.
This gene is expressed primarily in breast. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, breast cancer. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of tumor systems, expression of this gene at significantly higher or lower 

levels may be routinely detected in certain tissues (e.g., cancerous and wounded

tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides

corresponding to this gene are useful for treatment and diagnosis of some types of 
breast cancer. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 103 

The translation product of this gene shares sequence homology with secreted

serine proteases and lysozyme C precursor, which is thought to be important in 
bacteriolytic function. In specific embodiments, polypeptides of the invention comprise 
the sequence: NLCHVDCQDLLNPNLLAGIHCAKRIVS (SEQ ID NO:602); 
LDGFEGYSLSDWLCLAFVESKFN (SEQ ID NO:603);
NENADGSFDYGLFQENSHYWCN (SEQ ID NO:604); and/or

NLCHVDCQDLLNPNLLAGIHCAKRIVS (SEQ ID NO:605). Polynucleotides 
encoding these polypeptides are also encompassed by the invention. 
This gene is expressed primarily in testes. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a

biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, infection. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the immune system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Preferred epitopes include those 
comprising a sequence shown in SEQ ID NO: 336 as residues: Ile-62 to Phe-70 and
Asn-78 to Asn-84.

The tissue distribution and homology to lysozyme C precursor indicate that
polynucleotides and polypeptides corresponding to this gene are useful for boosting the
monocyte-macrophage system and enhancing the activity of immunoagents.

FEATURES OF PROTEIN ENCODED BY GENE NO: 104

This gene is expressed primarily in apoptotic T cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are 

not limited to, immune disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the immune system, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g., 

cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 

corresponding to this gene are useful for treatment and diagnosis of some immune
disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 105 

The translation product of this gene shares sequence homology with ARI 
protein of Drosophila (accession 2058299; EMBL: locus DMARIADNE, accession

X98309), which is thought to be important in axonal path-finding in the central nervous 
system. In specific embodiments, polypeptides of the invention comprise the sequence 
IREVNEVIQNPAT (SEQ ID NO:606); ITRILLSHFNWDKEKLMERYF 
DGNLEKLFA (SEQ ID NO:607); NTRSSAQDMPCQICYLNYPNSYF (SEQ ID 
NO:608); TGLECGHKFCMQCWSEYLTTKIMEEGMGQTISCPAHG (SEQ ID
NO:614);CDILVDDNTVMRLrTDSK^ 

CHHVVKVQYPDAKPV (SEQ ID NO:609); CDILVDDNTVMRLITDSK
VKLKYQHLITNSFVECNRLLKWCPAPDCHHVVKV (SEQ ID NO:610);
GCNHMVCRNQNCKAEFCWVCLGPWEPHGSAWYNCNRYNEDDAKAARDAQE
RSRAAEQRYL (SEQ ID NO:611); FYCNRYMNHMQSLRFEHKLYAQVKQ
KMEEMC^HNMSWIEVQFLKKAVDVLCQCRATLMYT (SEQ ID NO:612);
YVFAFYLKKNNQSCFENNQADLENATEVLSGYLERDISQDSLQDIKQKVQDKY
RYCESR (SEQ ID NO:613). Polynucleotides encoding these polypeptides are also
encompassed by the invention. 

This gene is expressed primarily in adult brain, and to a lesser extent in 
endometrial tumor, melanocytes, and infant brain. 

Therefore, polynucleotides and polypeptides of the invention are useful as

reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, diseases or injuries involving axonal path development. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 

immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the central nervous 
system, expression of this gene at significantly higher or lower levels may be routinely 
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., 
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample 

taken from an individual having such a disorder, relative to the standard gene

expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

The tissue distribution and homology to ARI protein indicate that
polynucleotides and polypeptides corresponding to this gene are useful for treatment of
disease states or injuries involving axonal path development, including
neurodegenerative diseases and nerve injury.

FEATURES OF PROTEIN ENCODED BY GENE NO: 106 

The translation product of this gene shares sequence homology with cytochrome 
b561 [Sus scrofa], which is thought to be an integral membrane protein of
neuroendocrine storage vesicles of neurotransmitters and peptide hormones. 

This gene is expressed primarily in frontal cortex and to a lesser extent in 
rhabdomyosarcoma. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a

biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, neurological disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the central nervous system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred 
epitopes include those comprising a sequence shown in SEQ ID NO: 339 as residues: 
Ser-18 to Pro-24.

The tissue distribution and homology to cytochrome b561 [Sus scrofa] indicate
that polynucleotides and polypeptides corresponding to this gene are useful for 
treatment and diagnosis of neurological disorders. This gene may also be important in 
regulation of some types of cancers. 


FEATURES OF PROTEIN ENCODED BY GENE NO: 107 

In specific embodiments, polypeptides of the invention comprise the sequence: 
MWGYLFVDAAWNFLGCLICGW (SEQ ID NO:615); MHFISSGNVSAIRSSILLL 
RXSLSYLGNCLRVSAIFVYFLLFLLLS (SEQ ID NO:616); and/or MDQALRGSPSE 
GFSTDPSPPQVGRQIPSFPPWRRLVLPKASGCFLEREWWLCVFKJLRTRPGAEA
HAYNSSILGGRGKGIT (SEQ ID NO:617). Polynucleotides encoding these 
polypeptides are also encompassed by the invention. 

This gene is expressed primarily in pancreas tumor and to a lesser extent in 
cerebellum. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, pancreatic tumors. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the endocrine system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred



WO 98/54963    PCT/US98/11422

epitopes include those comprising a sequence shown in SEQ ID NO: 340 as residues: 
Pro-22 to Phe-33. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of pancreatic tumors. 


FEATURES OF PROTEIN ENCODED BY GENE NO: 108 

This gene maps to chromosome 17 and therefore polynucleotides of the 
invention can be used in linkage analysis as a marker for chromosome 17. In specific 
embodiments, polypeptides of the invention comprise the sequence: 
MLPALASCCHFSPPEQAARLKKLQEQEKQQKVEFRKRMEKEVSDFIQDSGQIK
KKPQPMNKJERSILHDVVEVAGLTSFSFGEDDDCRYVMIFKKEFAPSDEELDSY
RRGEEWDPQKAEEKRNXKELAQRQ (SEQ ID NO:618); EEEAAQQGPVVV
SPASDYKDKYSHLIGKGAAKDAAHMLQANKTYGCXPVANKRDTRSIEEAMNE
IRAKKRLRQSGE (SEQ ID NO:619); PPRRPAQLPLTPGAGQGAGRDKAAAIRA
HPGAPPLNHLLP (SEQ ID NO:620); AVPQAGGKQVFDLSPLELGYVRGMCVCV
(SEQ ID NO:621) and/or MLPALASCCHFSPPEQAARLKKLQEQEKQQKVEFRK
RMEKEVSDFIQDSGQIKKKFQPMNKffiRSILHDVVEVAGLTSFSFGEDDDCRYV
MIFKKEFAPSDEELDSYRRGEEWDPQKAEEKRNXKELAQRQEEEAAQQGPVVV
SPASDYKDKYSHLIGKGAAKDAAHMLQANKTYGCXPVANKRDTRSIEEAMNE
IRAKKRLRQSGE (SEQ ID NO:622). Polynucleotides encoding these polypeptides
are also encompassed by the invention. The translation product of this gene shares 
sequence homology with FSA-1 which may play a role as a structural protein 
component of the acrosome. 

This gene is expressed primarily in fetal kidney and sperm. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, male reproductive disorders, especially involving acrosomal dysfunction.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the male
reproductive system, expression of this gene at significantly higher or lower levels may
be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 341 as residues: Glu-8 to Asn-35.

The tissue distribution and homology to FSA-1 indicate that polynucleotides
and polypeptides corresponding to this gene are useful for treatment of infertility due to
acrosomal dysfunction of sperm.

FEATURES OF PROTEIN ENCODED BY GENE NO: 109 

This gene is expressed primarily in pituitary and to a lesser extent in
epididymis.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, male reproductive disorders. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the male reproductive system, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 342 as residues: Met-1 to Trp-6.

Expression of the gene in both pituitary and epididymis indicates that
polynucleotides and polypeptides corresponding to this gene are useful for treatment
and diagnosis of male reproductive disorders. This may involve a secreted peptide
produced in the pituitary targeting the epididymis.

FEATURES OF PROTEIN ENCODED BY GENE NO: 110 

In specific embodiments, polypeptides of the invention comprise the sequence:
LLCPVLNSGXSWNFPHPSQPEYSFHGFHSTRLWI (SEQ ID NO:623); and/or
PSTPWFLFLLGLTCPFSTSHPRWDSIPP (SEQ ID NO:624). Polynucleotides
encoding these polypeptides are also encompassed by the invention.

This gene is expressed primarily in resting T-cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, T-cell disorders. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for treatment and diagnosis of certain immune
disorders, especially those involving T-cells.

FEATURES OF PROTEIN ENCODED BY GENE NO: 111 

This gene is expressed primarily in cerebellum and whole brain and to a lesser
extent in infant brain and fetal kidney.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurological disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the central nervous system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 344 as residues:
Asp-48 to Gly-55.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of neurological 
disorders. 






FEATURES OF PROTEIN ENCODED BY GENE NO: 112 

The translation product of this gene shares sequence homology with a yeast
mitochondrial ribosomal protein homologous to ribosomal protein S15 of E. coli, which
is thought to be important in the early assembly of ribosomes (See Accession No.
M38016). This gene maps to chromosome 1 and therefore may be used as a marker
in linkage analysis for chromosome 1.

This gene is expressed primarily in developmental tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, development of cancers and tumors in addition to healing wounds.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
immune and developmental systems, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.
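
The comparison described above, in which expression in a sample from an individual is judged against the standard level found in healthy tissue or bodily fluid, can be sketched in a few lines. This is an illustration only: the function name `classify_expression`, the use of a log2 ratio, and the two-fold cutoff are assumptions made for the example; the text says only "significantly higher or lower" and gives no numeric criterion.

```python
import math


def classify_expression(sample_level, standard_level, fold_cutoff=2.0):
    """Compare a measured expression level with a standard (healthy) level.

    fold_cutoff is a hypothetical threshold chosen for illustration;
    the source text does not define what counts as "significantly"
    higher or lower.
    """
    if sample_level <= 0 or standard_level <= 0:
        raise ValueError("expression levels must be positive")
    if fold_cutoff <= 1:
        raise ValueError("fold_cutoff must be greater than 1")

    # A log2 ratio treats over- and under-expression symmetrically.
    log2_ratio = math.log2(sample_level / standard_level)
    limit = math.log2(fold_cutoff)

    if log2_ratio >= limit:
        return "higher"
    if log2_ratio <= -limit:
        return "lower"
    return "comparable"


# Arbitrary example values in arbitrary units.
print(classify_expression(8.0, 2.0))   # four-fold above the standard
print(classify_expression(1.0, 4.0))   # four-fold below the standard
print(classify_expression(3.0, 2.0))   # within the two-fold window
```

In practice the cutoff, the units, and any statistical test would depend on the assay used; none of these is fixed by the passage above.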

The tissue distribution and homology to ribosomal protein S15 of E. coli
indicate that polynucleotides and polypeptides corresponding to this gene are useful for
diseases related to the assembly of ribosomes in the mitochondria, which is important in
the translation of RNA into protein. Therefore, this indicates that polynucleotides and
polypeptides corresponding to this gene are useful for diagnosis and intervention of
multiple tumors as well as in healing wounds, which are thought to be under similar
regulation as developmental tissues. The protein, as well as antibodies directed against
the protein, have utility as tumor markers, in addition to immunotherapy targets, for the
above listed tumors and tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 113 

The translation product of this gene shares sequence homology with human
poliovirus receptor precursors, which are thought to be important in viral binding and
uptake. Preferred polypeptide fragments comprise the following amino acid sequence:
ELSISISNVALADEGEYTCSIFTMPVRTAKSLVTVLGIPQKPIITGYKSSLREKDT
ATLNCQSSGSKPAARLTWRKGDQELHGEPTRIQEDPNGKTFTVSSSVTFQVTR
EDE)GASIVCSVNHESLKGADRSTSQRIEVLYTPTAMIRPDPPHPREGQKLLLHC
EGRGNPVPQQYLWEKEGSVPPLKMTQESALrFPFLNKSDSGTYGCTATSNMGS
YKAYYTLNVND (SEQ ID NO:625). Also preferred are polynucleotide fragments
encoding these polypeptide fragments (See Accession No. gnl|PID|d1002627).

This gene is expressed almost exclusively in human brain tissue.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, susceptibility to viral disease and diseases of the CNS, especially cancers
of that system. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the central nervous system, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 346 as residues: Leu-26 to Asp-37, Lys-53
to Ser-59.
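
The residue-range notation used for preferred epitopes (for example "Leu-26 to Asp-37, Lys-53 to Ser-59") names the first and last residue of each stretch by three-letter code and 1-based position. As a minimal sketch of how such ranges might be read against a one-letter sequence, the hypothetical helper `extract_epitopes` below parses the notation, checks that the named residues match the sequence, and returns each subsequence. The toy sequence is invented for the example; it is not SEQ ID NO: 346, which is not reproduced in this section.

```python
import re

# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Matches ranges written as "Xaa-N to Yaa-M".
RANGE_RE = re.compile(r"([A-Z][a-z]{2})-(\d+)\s+to\s+([A-Z][a-z]{2})-(\d+)")


def extract_epitopes(sequence, ranges_text):
    """Return the subsequences named by ranges such as 'Leu-26 to Asp-37'.

    Positions are 1-based and inclusive. A ValueError is raised when a
    range falls outside the sequence or when the residue named at either
    end does not match the residue found at that position.
    """
    epitopes = []
    for start_aa, start, end_aa, end in RANGE_RE.findall(ranges_text):
        start, end = int(start), int(end)
        if not 1 <= start <= end <= len(sequence):
            raise ValueError(f"range {start}-{end} is outside the sequence")
        for name, pos in ((start_aa, start), (end_aa, end)):
            code = THREE_TO_ONE.get(name)
            if code is None or sequence[pos - 1] != code:
                raise ValueError(f"expected {name} at position {pos}")
        epitopes.append(sequence[start - 1:end])
    return epitopes


# Hypothetical 12-residue sequence, used only to exercise the function.
toy_sequence = "MKLSAPDGTWSE"
print(extract_epitopes(toy_sequence, "Leu-3 to Asp-7, Thr-9 to Ser-11"))
# prints ['LSAPD', 'TWS']
```

Checking the end residues against the sequence is a design choice of this sketch; it catches off-by-one numbering and transcription errors in either the range or the sequence.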

The tissue distribution and homology to poliovirus receptor precursors indicate
that polynucleotides and polypeptides corresponding to this gene are useful for the
treatment and prevention of diseases that involve the binding and uptake of virus
particles for infection. It might also be helpful in genetic therapy where the goal is to
insert foreign DNA into infected cells. With the help of this protein, the binding and
uptake of this foreign DNA might be aided. In addition, it is expected that
overexpression of this gene will indicate abnormalities involving the CNS, particularly
cancers of that system.


FEATURES OF PROTEIN ENCODED BY GENE NO: 114 

The translation product of this gene shares sequence homology with
Y087_CAEEL hypothetical 28.5 KD protein ZK1236.7 in chromosome III of
Caenorhabditis elegans in addition to alpha-1 collagen type III (See Accession No.
gi|537432). One embodiment for this gene is the polypeptide fragment(s) comprising
the following amino acid sequence: VPELPDRVHQLHQAVQGCALGRPGFPGGPTH
SGHHKSHPGPAGGDYNRCDRPGQVHLHNPRGTGRRGQLHPTAGPGVHRRA
CPSQQLPHRLGPGVPCPSPSLTPVLPSWTQSWCGLPGYTSSS (SEQ ID
NO:630). An additional embodiment is the polynucleotide fragment(s) encoding these
polypeptide fragments.

This gene is expressed primarily in brain cells and to a lesser extent in activated
B and T cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurodegeneration and immunological disorders. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the neural and immune systems,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 347 as residues: Glu-34 to Glu-39, Gly-51 to Ser-72, Ala-88 to Glu-93, Gln-100
to Val-105.

The tissue distribution and homology to Y087_CAEEL hypothetical 28.5 KD
protein ZK1236.7 in chromosome III of Caenorhabditis elegans as well as to a
conserved alpha-1 collagen type III protein indicate that polynucleotides and
polypeptides corresponding to this gene are useful for the detection and treatment of
neurodegenerative disease states and behavioral disorders such as Alzheimer's Disease,
Parkinson's Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia,
obsessive compulsive disorder and panic disorders. Because the gene is expressed in
cells of lymphoid origin, the natural gene product may be involved in immune
functions. Therefore, it may also be used as an agent for immunological disorders
including arthritis, asthma, immune deficiency diseases such as AIDS, and leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 115 

The translation product of this gene shares sequence homology with alpha 3
type IX collagen, which is thought to be important in hyaline cartilage formation via its
role in the uptake of inorganic sulfate by cells (See Accession No. gi|975657). One
embodiment of this gene is the polypeptide fragment comprising the following amino
acid sequence: SLRRPRSAAXQTLTTFLSSVSSASSSALPGSREPCDPRAPPPPR
SGSAASCCSCCCSCPRRRAPLRSPRGSKRR1RQREVVDLYNGMCLQGPAGVPG
RDGSPGANGIPGTPGIPGRDGFKGEKGECLRESFEESWTPNYKQCSWSSLNY
GIDLGKJAECTFTKMRSNSALRVLFSGSLRLKCRNACCQRWYTTFNGAECSGP
LPIEAIIYLDQGSPEMNSTINIHRTSSVEGLCEGIGAGLVDVAIWVGTCSDYPKG
DASTGWNSVSRIIIEELPK (SEQ ID NO:634). An additional embodiment is the
polynucleotide fragments encoding this polypeptide fragment.

This gene is expressed primarily in smooth muscle and to a lesser extent in 
synovial tissue. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, dwarfism, spinal deformation, and specific joint abnormalities as well as
chondrodysplasias, i.e., spondyloepiphyseal dysplasia congenita, familial osteoarthritis,
atelosteogenesis type II, metaphyseal chondrodysplasia type Schmid and autoimmune
disorders. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the skeletal system, expression of this gene at significantly higher or lower levels may
be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution and homology to alpha 3 type IX collagen indicate that
polynucleotides and polypeptides corresponding to this gene are useful for the treatment
and diagnosis of diseases associated with mutations in this gene, which lead to the
many different types of chondrodysplasias. By the use of this product, the abnormal
growth and development of bones of the limbs and spine could be routinely detected or
treated in utero since the protein or muteins thereof could affect epithelial cells early in
development and later the chondrocytes of the developing craniofacial structure.

FEATURES OF PROTEIN ENCODED BY GENE NO: 116 

The translation product of this gene shares sequence homology with retrovirus-related
reverse transcriptase, which is thought to be important in viral replication. One
embodiment for this gene is the polypeptide fragments comprising the following amino
acid sequence: TKKENCRPASLMNIDTK1LNKILMNQ (SEQ ID NO:640). An
additional embodiment is the polynucleotide fragments encoding these polypeptide
fragments (See Accession No. pir|A25313|GNHUL1).

This gene is expressed primarily in human meningima. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, retroviral diseases such as AIDS, and possibly certain cancers due to
transactivation of latent cell division genes. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution and homology to retrovirus-related reverse transcriptase
indicate that polynucleotides and polypeptides corresponding to this gene are useful for
the detection and treatment of diseases and maladies associated with retroviral infection
since a functional reverse transcriptase (RT) or RT-like molecule is an integral
component of the retroviral life cycle.

FEATURES OF PROTEIN ENCODED BY GENE NO: 117 

The translation product of this gene shares sequence homology with an
unknown gene from C. elegans, as well as weak homology with mammalian metaxin, a
gene contiguous to both thrombospondin 3 and glucocerebrosidase that is known to be
required for embryonic development. Preferred polypeptide fragments comprise the
following amino acid sequence: MCNLPIKVVCRANAEYMSPSGKVPXXHVGNQ
VVSELGPIVQFVKAKGHSLSDGLEEVQKAEMKAYMELVNNMLLTAELYLQWC
DEATVGXITHXRYGSPYPWPLXHILAYQKQWEVKRKXKAIGWGKKTLDQVLE
DVDQCCQALSQRLGTQPYFFNKQPTELDALVFGHLYTILTTQLTNDELSEKVKN
YSNLLAFCRRIEQHYFEDRGKGRLS (SEQ ID NO:641); MCNLPIKVVCRANAE
YMSPSGKVPXXHVGNQVVSELGPIVQFVK (SEQ ID NO:642). Also preferred are
polynucleotide fragments encoding these polypeptide fragments (See Accession No.
gi|1326108).

This gene is expressed primarily in fetal tissues and to a lesser extent in
hematopoietic cells and tissues, including spleen, monocytes, and T cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancer; lymphoproliferative disorders; inflammation; chondrosarcoma;
and Gaucher disease. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the hematopoietic and embryonic systems, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis and treatment of cancer and other
proliferative disorders. Expression in embryonic tissue and other cellular sources
marked by proliferating cells indicates that this protein may play a role in the regulation
of cellular division. Additionally, the expression in hematopoietic cells and tissues
indicates that this protein may play a role in the proliferation, differentiation, and
survival of hematopoietic cell lineages. Thus, this gene may be useful in the treatment
of lymphoproliferative disorders, and in the maintenance and differentiation of various
hematopoietic lineages from early hematopoietic stem and committed progenitor cells.

FEATURES OF PROTEIN ENCODED BY GENE NO: 118 

The translation product of this gene shares sequence homology with reverse
transcriptase, which is important in the synthesis of a cDNA chain from an RNA
molecule, the process by which the infecting RNA chains of retroviruses are
transcribed into their DNA complements. One embodiment for this gene is the
polypeptide fragment comprising the following amino acid sequence:
MXXXNSHITIFTLNVNGLNAPNERHRLANWIQSQDQVCCIQETHLTGRDTHRL
KIKGWRKIYQANGKQKK (SEQ ID NO:647). An additional embodiment is the
polynucleotide fragments comprising polynucleotides encoding these polypeptide
fragments (See Accession No. gi|2072964).

This gene is expressed primarily in skin and to a lesser extent in neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancer; hematopoietic disorders; inflammation; and disorders of immune
surveillance. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the epidermis and/or hematopoietic system, expression of this gene at significantly
higher or lower levels may be routinely detected in certain tissues (e.g., cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder.

The tissue distribution and homology to reverse transcriptase indicate that
polynucleotides and polypeptides corresponding to this gene are useful for cancer
therapy. Expression in the skin also indicates that this gene is useful in wound healing
and fibrosis. Expression by neutrophils also indicates that this gene product plays a role
in inflammation and the control of immune surveillance (i.e., recognition of viral
pathogens). Reverse transcriptase family members are also useful in the detection and
treatment of AIDS.

FEATURES OF PROTEIN ENCODED BY GENE NO: 119 

The translation product of this gene shares sequence homology with reverse
transcriptase, which is important in the synthesis of a cDNA copy of an RNA molecule,
the process by which a retrovirus reverse-transcribes its genome into a heritable
DNA copy.

This gene is expressed primarily in the frontal cortex of brain.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancer and neurodegenerative disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the CNS and peripheral nervous system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution and homology to reverse transcriptase suggest that this
gene is useful in the treatment of cancer and AIDS. The expression in brain indicates
that it plays a role in neurodegenerative disorders and in neural degeneration.



FEATURES OF PROTEIN ENCODED BY GENE NO: 120

One embodiment of this gene has homology to a hypothetical protein in
Schizosaccharomyces pombe (See Accession No. 2281980). Another embodiment for
this gene is the polypeptide fragments comprising the following amino acid sequence:
IYHLHSWIFFHFKRAFCMCFITMKVIHAHCSKLRKCXNAQISVFCTTLTASYPT
(SEQ ID NO:651). An additional embodiment is the polynucleotide fragments
encoding these polypeptide fragments. This gene maps to chromosome 18 and
therefore may be used as a marker in linkage analysis for chromosome 18.

This gene is expressed primarily in adult hypothalamus and to a lesser extent in
infant brain.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurodegenerative disorders; endocrine function; and vertigo. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the brain, CNS and
peripheral nervous system, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the treatment and diagnosis of
neurodegenerative disorders; diagnosis of tumors of a brain or neuronal origin;
treatments involving hormonal control of the entire body and of homeostasis,
behavioral disorders, such as Alzheimer's Disease, Parkinson's Disease, Huntington's
Disease, schizophrenia, mania, dementia, paranoia, obsessive compulsive disorder and
panic disorder. In addition, the gene or gene product may also play a role in the
treatment and/or detection of developmental disorders associated with the developing
embryo.

FEATURES OF PROTEIN ENCODED BY GENE NO: 121 

The translation product of this gene shares sequence homology with the human
IRLB protein, which is thought to be important in binding to a c-myc promoter element
and thus regulating its transcription (See Accession No. gi|33969). This gene maps to
chromosome 1 and therefore may be used as a marker in linkage analysis for
chromosome 1.

This gene is expressed primarily in brain and breast and to a lesser extent in a 
variety of hematopoietic tissues and cells. 
Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancer of the brain and breast; lymphoproliferative disorders; and
neurodegenerative diseases. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the CNS, breast, and immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the treatment and diagnosis of cancer of the
brain, breast, and hematopoietic system. In addition, it may be useful for the treatment
of neurodegenerative disorders, as well as disorders of the hematopoietic system,
including defects in immune competency and inflammation. The protein, as well as
antibodies directed against the protein, may show utility as a tumor marker and
immunotherapy target for the above listed tumors and tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 122 

The translation product of this gene shares sequence homology with an ATP 
synthase, a key component of the proton channel that is thought to be important in the 
translocation of protons across the membrane. 

This gene is expressed primarily in T cell lymphoma.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, T cell lymphoma. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution and homology to ATP synthase indicate that
polynucleotides and polypeptides corresponding to this gene are useful for the treatment
of defects in proton transport, homeostasis, and metabolism, as well as the diagnosis
and treatment of lymphoma. Because the gene is expressed in cells of lymphoid origin,
the natural gene product may be involved in immune functions. Therefore, it may also be
used as an agent for immunological disorders including arthritis, asthma, immune
deficiency diseases such as AIDS, and leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 123 

This gene maps to chromosome 15, and therefore, may be used as a marker in
linkage analysis for chromosome 15.

This gene is expressed primarily in a variety of fetal tissues, including fetal 
liver, lung, and spleen, and to a lesser extent in a variety of blood cells, including 
eosinophils and T cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancer (abnormal cell proliferation); T cell lymphomas; and hematopoietic
disorders. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the fetus and immune system, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the treatment and diagnosis of conditions
involving cell proliferation. Expression of this gene in fetal tissues, as well as in a
variety of blood cell lineages, indicates that it may play a role in cellular
proliferation, apoptosis, or cell survival. Thus it may be useful in the management and
treatment of a variety of cancers and malignancies. In addition, its expression in blood
cells suggests that it may play additional roles in hematopoietic disorders and conditions,
and could be useful in treating diseases involving autoimmunity, immune modulation,
immune surveillance, and inflammation.

FEATURES OF PROTEIN ENCODED BY GENE NO: 124 

This gene is expressed primarily in placenta and to a lesser extent in pineal gland 
and rhabdomyosarcoma. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, developmental, endocrine, and female reproductive disorders. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the endocrine and
reproductive systems, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred 
epitopes include those comprising a sequence shown in SEQ ID NO: 357 as residues: 
Leu-69 to Val-76. 
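
Epitope ranges of the form "Leu-69 to Val-76" can be read as 1-based, inclusive residue positions in the referenced polypeptide. The Python sketch below parses such ranges and slices the corresponding peptide out of a sequence, checking that the named residues match. The function names and the placeholder sequence are hypothetical; the actual sequence of SEQ ID NO: 357 is not reproduced in this section.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

RANGE_RE = re.compile(r"([A-Z][a-z]{2})-\s*(\d+)\s*to\s*([A-Z][a-z]{2})-\s*(\d+)")


def parse_epitope_ranges(text):
    """Return (first_residue, start, last_residue, end) tuples found in text."""
    return [(a, int(i), b, int(j)) for a, i, b, j in RANGE_RE.findall(text)]


def extract_epitopes(sequence, text):
    """Slice each named range (1-based, inclusive) out of a one-letter sequence."""
    peptides = []
    for first, start, last, end in parse_epitope_ranges(text):
        peptide = sequence[start - 1:end]
        if (len(peptide) != end - start + 1
                or peptide[0] != THREE_TO_ONE[first]
                or peptide[-1] != THREE_TO_ONE[last]):
            raise ValueError(f"{first}-{start} to {last}-{end} does not match the sequence")
        peptides.append(peptide)
    return peptides


# Placeholder sequence (NOT SEQ ID NO: 357): glycines with Leu at 69 and Val at 76.
residues = ["G"] * 80
residues[68], residues[75] = "L", "V"
placeholder = "".join(residues)
print(extract_epitopes(placeholder, "Leu-69 to Val-76"))
```

The residue check is a cheap guard against off-by-one errors or a mismatched sequence listing.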

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of disorders in
development. The protein, as well as antibodies directed against the protein, may show
utility as a tumor marker and immunotherapy target for the above listed tumors and
tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 125

This gene is expressed primarily in benign prostatic hyperplasia. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of benign prostatic hyperplasia. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the reproductive
system, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment and diagnosis of benign prostatic 
hyperplasia. The protein, as well as antibodies directed against the protein, may show utility
as a tumor marker and immunotherapy target for the above listed tumors and tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 126 

This gene is expressed primarily in apoptotic T-cells and to a lesser extent in 
suppressor T cells and ulcerative colitis. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, diseases involving premature apoptosis, and immunological and
gastrointestinal disorders. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment and diagnosis of disorders involving
inappropriate levels of apoptosis, especially in immune cell lineages. Because the gene
is expressed in cells of lymphoid origin, the natural gene product may be involved in
immune functions. Therefore it may be also used as an agent for immunological
disorders including arthritis, asthma, immune deficiency diseases (such as AIDS), and
leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 127

This gene is expressed primarily in Raji cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammation and T cell autoimmune disorders. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the immune system, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 360 as residues: Asp-23 to Gly-29. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment and diagnosis of inflammation and T 
cell autoimmune disorders. Because the gene is expressed in cells of lymphoid origin,
the natural gene product may be involved in immune functions. Therefore it may be also
used as an agent for immunological disorders including arthritis, asthma, immune 
deficiency diseases (such as AIDS), and leukemia. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 128 
The translation product of this gene shares sequence homology with a C.
elegans coding region C47D12.2 of unknown function (See Accession No.
gnl|PID|e348986). One embodiment for this gene is the polypeptide fragments
comprising the following amino acid sequence: EDDGFNRSIHEVILKNITWY
SERVLTEISLGSLLILVVIRTIQYNMTRTRDKYLHTNCLAALANMSAQFRSLHQY
AAQRIISLFSLLSKKHNKVLEQATQSLRGSLSSNDVPLPDYAQDLNVIEEVIRMM
LEIINSCLTNSLHHNPNLVALLYKRDLFEQFRTHPSFQDLMQNIDLVISFFSSRLL
QAGS (SEQ ID NO:657); EDDGFNRSIHEVILKNITWYSERVLTEISLGSLLILVV
(SEQ ID NO:658); RTIQYNMTRTRDKYLHTNCLAALANMSAQFRSLHQYAAQ
RIISLFSLLSKKHN (SEQ ID NO:659); KKHNKVLEQATQSLRGSLSSNDVPLPDY
AQD (SEQ ID NO:661); SCLTNSLHHNPNLVYALLYKRDLFEQFRTHPSFQD
IMQNIDLVISFFSSRLLQAGS (SEQ ID NO:660). An additional embodiment is the
polynucleotide fragments encoding these polypeptide fragments. This gene maps to
chromosome 18, and therefore, may be used as a marker in linkage analysis for
chromosome 18.
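
Because the shorter fragments listed for this gene are presented as portions of the longest sequence (SEQ ID NO:657), a simple substring search gives a quick consistency check on a transcription. The Python sketch below uses the sequences as they appear in this copy of the text; the function name is hypothetical, and a fragment that is not located may reflect a transcription error in either sequence rather than a real difference.

```python
# Sequences copied from the paragraph above, as transcribed in this copy.
FULL_657 = (
    "EDDGFNRSIHEVILKNITWY"
    "SERVLTEISLGSLLILVVIRTIQYNMTRTRDKYLHTNCLAALANMSAQFRSLHQY"
    "AAQRIISLFSLLSKKHNKVLEQATQSLRGSLSSNDVPLPDYAQDLNVIEEVIRMM"
    "LEIINSCLTNSLHHNPNLVALLYKRDLFEQFRTHPSFQDLMQNIDLVISFFSSRLL"
    "QAGS"
)

FRAGMENTS = {
    658: "EDDGFNRSIHEVILKNITWYSERVLTEISLGSLLILVV",
    659: "RTIQYNMTRTRDKYLHTNCLAALANMSAQFRSLHQYAAQRIISLFSLLSKKHN",
    661: "KKHNKVLEQATQSLRGSLSSNDVPLPDYAQD",
    660: "SCLTNSLHHNPNLVYALLYKRDLFEQFRTHPSFQDIMQNIDLVISFFSSRLLQAGS",
}


def locate_fragments(full, fragments):
    """Map each SEQ ID NO to its 1-based start position in full, or None if absent."""
    positions = {}
    for seq_id, fragment in fragments.items():
        index = full.find(fragment)
        positions[seq_id] = index + 1 if index >= 0 else None
    return positions


for seq_id, start in locate_fragments(FULL_657, FRAGMENTS).items():
    status = f"starts at residue {start}" if start else "not found as transcribed"
    print(f"SEQ ID NO:{seq_id}: {status}")
```

As transcribed here, SEQ ID NO:660 is not located within SEQ ID NO:657, so those two sequences should be checked against the original sequence listing.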

This gene is expressed primarily in smooth muscle and to a lesser extent in fetal
liver.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, atherosclerosis and other cardiovascular and hepatic disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the circulatory
system, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of circulatory system
disorders such as atherosclerosis, hypertension, and thrombosis. In addition, the tissue
distribution indicates that polynucleotides and polypeptides corresponding to this gene
are useful for the detection and treatment of liver disorders and cancers (e.g.,
hepatoblastoma, jaundice, hepatitis, liver metabolic diseases and conditions that are
attributable to the differentiation of hepatocyte progenitor cells). In addition, the
expression in fetus would suggest a useful role for the protein product in developmental
abnormalities, fetal deficiencies, pre-natal disorders and various wound-healing models
and/or tissue trauma.

FEATURES OF PROTEIN ENCODED BY GENE NO: 129 

The translation product of this gene shares sequence homology with a ribosomal
protein which is thought to be important in cellular metabolism, in addition to the
C. elegans protein F40F11.1 which does not have a known function at the current time
(See Accession No. gnl|PID|e244552). Preferred polypeptide fragments comprise the
following amino acid sequence:
MADIQTERAYQKQFTIFQNKKRVLLGETGKEKLPRVTNKNIGLGFKDT
PRRLLRGTYIDKKCPFTGNVSIRGRILSGVVTQDEDAEDHCHPPRLSALHPQVQ
PLREAPQEHVCTPVPL LQGRPDR (SEQ ID NO:662); MKMQRTIVIRRDYLH
YIRKYNRFEKRHKNMSVHLSPCFRDVQIGDIVTVGLCRPLSKTVRFNVLKVTK
AAGTKKQFQKF (SEQ ID NO:663); MADIQTERAYQKQFTIFQNKKRVLLGET
GK (SEQ ID NO:664); HCHPPRLSALHPQVQPLREAPQEHVCTPVPL LQGRPDR
(SEQ ID NO:666); NIGLGFKDTPRRLLRGTYIDKKCPFTGNVSIRGRILSGVVTQ
(SEQ ID NO:669); MKMQRTIVIRRDYLHYIRKYNRFEKRHKNMSVHLSP (SEQ
ID NO:667); CFRDVQIGDIVTVGECRPLSKTVRFNVLKVTKAAGTKKQFQKF
(SEQ ID NO:668). Also preferred are polynucleotide fragments encoding these
polypeptide fragments.

This gene is expressed primarily in Wilms' tumor and to a lesser extent in
thymus and stromal cells.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, diseases affecting RNA translation. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of Wilms' tumors, expression of this gene
at significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred 
epitopes include those comprising a sequence shown in SEQ ID NO: 362 as residues: 
Thr-11 to Asp-20. 

The tissue distribution and homology to a ribosomal protein indicate that
polynucleotides and polypeptides corresponding to this gene are useful for diseases
affecting RNA translation.

FEATURES OF PROTEIN ENCODED BY GENE NO: 130 

The translation product of this gene shares sequence homology with a yeast
DNA helicase which is thought to be important in global transcriptional regulation (See
Accession No. gnl|PID|e243594). One embodiment for this gene is the polypeptide
fragments comprising the following amino acid sequence: IFYDSDWNPTVDQQA
MDRAHRLGQTKQVTVYRLICKGTIEERILQRAKEKSEIQRMVISG (SEQ ID
NO:670); TRMIDLLEEYMVYRKHTYXRLDGSSKISERRDMVADFQNRNDI
FVFLLSTRAGGLGENLTAXDTVHF (SEQ ID NO:671); TRMIDLLEEYMVYRK
HTYXRLDGSSKISERRDM (SEQ ID NO:674); RRDMVADFQNRNDIFVFLL
STRAGGLGIXLTAXDTVHF (SEQ ID NO:675); IFYDSDWNPTVDQQAMD
RAHRLGQTKQVTVYRLICKG (SEQ ID NO:676); RLICKGTIEERILQRAK
EKSEIQRMVISG (SEQ ID NO:678). An additional embodiment is the polynucleotide
fragments encoding these polypeptide fragments.

This gene is expressed primarily in amygdala.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, diseases and disorders of the brain. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the central nervous system, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution and homology to a DNA helicase indicate that
polynucleotides and polypeptides corresponding to this gene are useful for diseases
affecting RNA transcription, particularly developmental disorders and healing wounds,
since the latter are thought to approximate developmental transcriptional regulation.

FEATURES OF PROTEIN ENCODED BY GENE NO: 131 

This gene is expressed primarily in prostate and to a lesser extent in amygdala
and pancreatic tumors.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, prostate enlargement and gastrointestinal disorders, particularly of the
pancreas and gall bladder. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the reproductive system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment and diagnosis of prostate diseases,
including benign prostatic hyperplasia and prostate cancer. In addition, the tissue
distribution in tumors of the pancreas indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and intervention of these tumors, in
addition to other tissues where expression has been indicated. The protein, as well as
antibodies directed against the protein, may show utility as a tumor marker and/or
immunotherapy target for the above listed tumors and tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 132 

This gene is expressed primarily in adult lung and to a lesser extent in 
hypothalamus. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, pulmonary diseases and neurological disorders. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the pulmonary and respiratory
systems, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of pulmonary and
respiratory disorders such as emphysema, pneumonia, and pulmonary edema and
emboli. In addition, the tissue distribution indicates that polynucleotides and
polypeptides corresponding to this gene are useful for the detection/treatment of
neurodegenerative disease states and behavioral disorders such as Alzheimer's Disease,
Parkinson's Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia,
obsessive compulsive disorder and panic disorder. In addition, the gene or gene
product may also play a role in the treatment and/or detection of developmental
disorders associated with the developing embryo, sexually-linked disorders, or
disorders of the cardiovascular system. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 133 

This gene is expressed primarily in human liver.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, cirrhosis of the liver and other hepatic disorders. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the digestive system, expression
of this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of liver disorders such
as cirrhosis, jaundice, and hepatitis. The protein, as well as antibodies directed against the
protein, may show utility as a tumor marker and/or immunotherapy target for the above
listed tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 134

This gene is expressed primarily in fetal kidney and to a lesser extent in fetal 
liver and spleen. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, development and regeneration of liver and kidney and immunological
disorders. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the digestive and excretory systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 367 as residues: Pro-70 to Arg-77,
Tyr-102 to Thr-107.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of diseases of the 
kidney and liver, such as cirrhosis, kidney failure, kidney stones, and liver failure, 
hepatoblastoma, jaundice, hepatitis, liver metabolic diseases and conditions that are
attributable to the differentiation of hepatocyte progenitor cells. In addition, the
expression in fetus would suggest a useful role for the protein product in developmental
abnormalities, fetal deficiencies, pre-natal disorders and various wound-healing models
and/or tissue trauma. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 135

This gene is expressed primarily in brain, bone marrow, and to a lesser extent in 
placenta, T cell, testis and neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurodegenerative and immunological diseases and cancer. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the nervous and
immune systems, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 368 as residues: Met-1 to His-6. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the detection/treatment of neurodegenerative 
disease states and behavioral disorders such as Alzheimer's Disease, Parkinson's
Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia, obsessive
compulsive disorder and panic disorder. In addition, the gene or gene product may also
play a role in the treatment and/or detection of developmental disorders associated with
the developing embryo, or sexually-linked disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 136 

The translation product of this gene is homologous to the human WD repeat
protein HAN11. Preferred polypeptide fragments comprise the following amino acid
sequence:

MSLHGKRKEIYKYEAPWTVYAMNWSVRPDKRFRLALGSFVEEYNNKVQLVG
LDEESSEFICRNTFDHPYPTTKLMWIPDTKGVYPDLLATSGDYLRVWRVGETET
RLECLLNNNKNSDFCAPLTSFDWNEVDPYLLGTSSIDTTCTIWGLETGQVLGRV
NLVSGHVKTQLIAHDKEVYDIAFSRAGGGRDMFASVGADGSVRMFDLRHLEH
STIIYEDPQHHPLLRLCWNKQDPNYLATMAMDGMEVVILDVRVPAHLXPGTTIE
HVSMALLGPHIHPATSALQRMTTRLSSGTSSKCPEPLRTLSWPTQLXGEINNVQ
WASTQPELSPSATTTAWRYSECSVGGAVPTRQGLLYFLPLPHPQS (SEQ ID
NO:679); MSLHGKRKEIYKYEAPWTVYAMNWSVRPDKRFRLALGSFV
EEYNNKVQLVGLDEESSEFICRNTFDHPYPTTKLMWIPDTKGVYPDLLATSGDY
LRVWRVGETETRLECLLNNNKNSDFCAPLTSFDWNEVDPYLL (SEQ ID
NO:680); SFDWNEVDPYLLGTSSIDTTCTIWGLETGQVLGRVNLVSGHVK
TQLIAHDKEVYDIAFSRAGGGRDMFASVGADGSVRMFDLRHLEHSTIIYEDPQH
HPLLRLCWNKQDPNYLATMAMDGMEVVILDVRVPAHLXPGTTI (SEQ ID
NO:681); VGADGSVRMFDLRHLEHSTIIYEDPQHHPLLRLCWNKQDPNYLA
TMAMDGMEVVILDVRVPAHLXPGTTIEHVSMALLGPHIHPATSALQRMTTRLS
SGTSSKCPEPLRTLSWPTQLXGEINNVQWASTQPELSPSATTTAWRYSECSVG
GAVPTRQGLLYFLPLPHPQS (SEQ ID NO:682). Also preferred are polynucleotide
fragments encoding these polypeptide fragments.

This gene is expressed primarily in placenta, embryo, T cell and fetal lung and 
to a lesser extent in endothelial, tonsil and bone marrow. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immunological and developmental diseases in addition to cancers. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
immune system, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a 
sequence shown in SEQ ID NO: 369 as residues: Gly-19 to Gln-28, Pro-36 to Phe-42. 
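
Epitope locations in this section are written as residue ranges such as "Gly-19 to Gln-28": a three-letter residue name and a 1-based position for each end of the span. As a minimal sketch only, and not part of the disclosed methods, the following Python shows one way such ranges could be parsed and applied to a one-letter amino acid sequence. The function names `parse_epitope_ranges` and `slice_epitope` are hypothetical, and any sequence passed in is assumed to be supplied by the reader; no SEQ ID NO from this document is reproduced here.

```python
import re

# One endpoint pair such as "Gly-19 to Gln-28". Whitespace around the hyphen
# is tolerated because scanned text sometimes splits "Ala- 145".
_RANGE = re.compile(
    r"([A-Z][a-z]{2})\s*-\s*(\d+)\s+to\s+([A-Z][a-z]{2})\s*-\s*(\d+)"
)


def parse_epitope_ranges(text):
    """Return a list of (start_name, start_pos, end_name, end_pos) tuples."""
    return [(a, int(i), b, int(j)) for a, i, b, j in _RANGE.findall(text)]


def slice_epitope(sequence, start_pos, end_pos):
    """Return residues start_pos..end_pos (1-based, inclusive) of a sequence."""
    if not 1 <= start_pos <= end_pos <= len(sequence):
        raise ValueError("residue range lies outside the sequence")
    return sequence[start_pos - 1:end_pos]
```

The positions are treated as inclusive at both ends, which matches the usual reading of "residues X to Y"; a caller would typically also check that the residue names returned by the parser agree with the letters found at those positions.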



The tissue distribution in tumors of colon, ovary, and breast origins indicates
that polynucleotides and polypeptides corresponding to this gene are useful for
diagnosis and intervention of these tumors, in addition to other tumors where
expression has been indicated. Protein, as well as, antibodies directed against the
protein may show utility as a tumor marker and/or immunotherapy targets for the above
listed tumors and tissues. Because the gene is expressed in cells of lymphoid origin, the
natural gene product may be involved in immune functions. Therefore it may be also
used as an agent for immunological disorders including arthritis, asthma, immune
deficiency diseases such as AIDS, and leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 137

This gene is expressed primarily in TNF and INF induced epithelial cells, T
cells and kidney.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, inflammatory conditions particularly inflammatory reactions in the
kidney. Similarly, polypeptides and antibodies directed to these polypeptides are useful
in providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the renal
system, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 370 as residues: Thr-67 to Gly-72, Gln-132 to Ala-145,
Arg-150 to Pro-157.

The tissue distribution indicates that the protein products of this gene are useful
for treating the damage caused by inflammation of the kidney.

FEATURES OF PROTEIN ENCODED BY GENE NO: 138

This gene maps to chromosome 1, and therefore, may be used as a marker in 
linkage analysis for chromosome 1 (See Accession No. D63485). 

This gene is expressed primarily in breast cancer and colon cancer and to a 
lesser extent in thymus and fetal spleen.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, cancers, especially of the breast and colon tissues. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution in tumors of colon and breast origins indicates that 
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis
and intervention of these tumors, in addition to other tumors where expression has been
indicated. Protein, as well as, antibodies directed against the protein may show utility as 
a tumor marker and/or immunotherapy targets for the above listed tumors and tissues. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 139

This gene maps to chromosome 17, and therefore, can be used as a marker for 
linkage analysis from chromosome 17. 

This gene is expressed primarily in CD34 positive cells, and to a lesser extent in
activated T-cells and neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immunologically related diseases and hematopoietic disorders. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune system
and hematopoietic system, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution in CD34, T-cell and neutrophils indicates that 
polynucleotides and polypeptides corresponding to this gene are useful for treatment 
and diagnosis of hematopoietic disorders and immunologically related diseases, such as 
anemia, leukemia, inflammation, infection, allergy, immunodeficiency disorders, 
arthritis, asthma, and immune deficiency diseases such as AIDS.

FEATURES OF PROTEIN ENCODED BY GENE NO: 140 

This gene was recently cloned by another group, who called the gene 
KIAA0313 gene. (See Accession No. dl021609.) Preferred polypeptide fragments
comprise the amino acid sequence:

LYATATVISSPSTEXLSQDQGDRASLDAADSGRGSWTSCSSGSHDN1QTIQ 
HQRSWETLPFGHTHFDYSGDPAGLWASSSHMDQIMFSDHSTKYNRQNQSRES 
LEQAQSRASWASSTGYWGEDSEGDTGTIKRRGGKDVSrEAESSSLTSVTTEETK 
PVPMPAHIAVASSTTKGLIARKEGRYREPPPTPPGYIGIPITDFPEGHSHPARKP
PDYNVALQRSRMVARSSDTAGPSSVQQPHGHPTSSRPVNKPQWHKXNESDPR
LAPYQSQGFSTEEDEDEQVSAV (SEQ ID NO:683); HMDQLMFSDHSTKYNRQ 
NQSRESLEQAQSRASWASSTGYWGE (SEQ ID NO:684); SVTTEETKPVPMP 
AHIAVASSTTKGLIARKEGRYREPPPTPPGYIGIPITD (SEQ ID NO:685); and 
VALQRSRMVARSSDTAGPSSVQQPHGHPTSSRPVNKPQW
HKXNESDPRLAPYQSQGF (SEQ ID NO:686). Also preferred are polynucleotide
fragments encoding these polypeptide fragments. This gene maps to chromosome 4,
and therefore, may be used as a marker in linkage analysis for chromosome 4 (See
Accession No. AB002311).

This gene is expressed primarily in ovarian cancer and in tumors of the testis, brain,
and colon.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, ovarian, testicle, brain and colon cancers. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the male and female reproductive systems,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution in tumors of colon, ovary, testis, and brain origins 
indicates that polynucleotides and polypeptides corresponding to this gene are useful for 
diagnosis and intervention of these tumors, in addition to other tumors where 
expression has been indicated. Protein, as well as, antibodies directed against the
protein may show utility as a tumor marker and/or immunotherapy targets for the above
listed tumors and tissues. 



FEATURES OF PROTEIN ENCODED BY GENE NO: 141

This gene is expressed primarily in spleen and colon cancer.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, colon cancer and immunological disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the gastrointestinal tract and immune
systems, expression of this gene at significantly higher or lower levels may be routinely
detected in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g.,
serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample
taken from an individual having such a disorder, relative to the standard gene
expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution in tumors of colon, ovary, and breast origins indicates
that polynucleotides and polypeptides corresponding to this gene are useful for
diagnosis and intervention of these tumors, in addition to other tumors where
expression has been indicated. Protein, as well as, antibodies directed against the
protein may show utility as a tumor marker and/or immunotherapy targets for the above
listed tumors and tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 142

Translation product is homologous to T cell translocation protein, a putative zinc 
finger factor (See Accession No. 340454), as well as to the G-protein coupled receptor 
TM5 consensus polypeptide (See Accession No. R50734). Preferred polypeptide 
fragments comprise the following amino acid sequence:

CLLFVFVSLGMRCLFWTIVYNVLYLKHKCNTVLLCYHLCSI (SEQ ID NO:687); 
ACSKLIPAFEMVMRAKDNVYHLDCFACQLCNQRXCVGDKFFLKNNXXLCQT 
DYEEGLMKEGYAPXVR (SEQ ID NO:688). Also preferred are polynucleotide
fragments encoding these polypeptide fragments. 

This gene is expressed primarily in fetal brain.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, neurological disorders including brain cancer. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the Central Nervous System,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the detection/treatment of neurodegenerative
disease states and behavioral disorders such as Alzheimer's Disease, Parkinson's
Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia, obsessive
compulsive disorder and panic disorder. In addition, the gene or gene product may also
play a role in the treatment and/or detection of developmental disorders associated with
the developing embryo.

FEATURES OF PROTEIN ENCODED BY GENE NO: 143 

Translation product for this gene has significant homology to the Fas ligand, 
which is a cysteine-rich type II transmembrane protein/tumor necrosis factor receptor 
homolog. Mutations within this protein have been shown to result in generalized
lymphoproliferative disease leading to the development of lymphadenopathy and 
autoimmune disease (See Medline Article No. 94185175). Preferred polypeptide
fragments comprise the following amino acid sequence:

SALSEPGAPDRRRPCPESVPRRPDDEQWPPPTALCLDVAPLPPSS (SEQ ID 
NO:689). Also preferred are polynucleotide fragments encoding these polypeptide 
fragments (See Accession No. 473565). 

This gene is expressed primarily in osteoblasts, lung, and brain.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, osteoblast-related, pulmonary, neurological, and immunological
diseases. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the skeletal and nervous systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 376 as residues: Trp-33 to Thr-40,
Lys-45 to Ile-63.

The tissue distribution in osteoblasts, lung, and brain combined with its 
homology to the Fas ligand indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and intervention of these tumors, in 
addition to other tumors where expression has been indicated. Protein, as well as,
antibodies directed against the protein may show utility as a tumor marker and/or
immunotherapy targets for the above listed tumors and tissues. Because the Fas ligand
gene is known to be expressed in cells of lymphoid origin, the natural gene product
may be involved in immune functions. Therefore it may be also used as an agent for
immunological disorders including asthma, immune deficiency diseases such as AIDS
and leukemia, and various autoimmune disorders including lupus and arthritis.

FEATURES OF PROTEIN ENCODED BY GENE NO: 144 

This gene shares sequence homology with a 21.5 KD transmembrane protein in 
the SEC15-SAP4 intergenic region of yeast. (See Accession No. 1723971.) Preferred 
polypeptide fragments comprise the amino acid sequence:

AHASESGERWWACCGVRFGLRSIEAIGRSCCHDGPGGLVANRGRRFKWAIEL
SGPGGGSRGRSDRGSGQGDSLYPVGYLDKQVPDTSVQETDRILVEKRCWDIAL
GPLKQIPMNXHMYMAGNTISIFPTMN1VCMMA\VRPIQALMA1SATFKMLESSSQ
KFLQGLVYLIGNLMGLALAVYKCQSMGLLPTHASDWLAFIEPPERMEFSGG
GLLL (SEQ ID NO:691); PVGYLDKQVPDTSVQETDRILVEKRCWDIALGPLKQ
IPMNLFI (SEQ ID NO:693); and ATFKMLESSSQKFLQGLVYLIGNLMGLALAV
YKCQSMGLLPTHASD (SEQ ID NO:692). Also preferred are polynucleotide
fragments encoding these polypeptide fragments.

This gene is expressed primarily in osteoclastoma, hemangiopericytoma, liver, and
lung.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, osteoclastoma, hemangiopericytoma, liver and lung tumors. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the above tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the lung
and liver systems, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosing osteoclastoma, 
hemangiopericytoma, liver and lung tumors. 


FEATURES OF PROTEIN ENCODED BY GENE NO: 145 

Translation product of this gene shares homology with the glucagon-69 gene 
which may indicate this gene plays a role in regulating metabolism. (See Accession No.
A60318.) One embodiment for this gene is the polypeptide fragments comprising the
following amino acid sequence:

PTTKLDIMEKKKMIQIRFPSFYHKLVDSGRMRSKJIETRREDSDTKHNL (SEQ ID
NO:694). An additional embodiment is the polynucleotide fragments encoding these
polypeptide fragments.

This gene is expressed primarily in brain, kidney, colon, and testis. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, brain, kidney, colon, and testicular cancer. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the male reproductive system, neurological,
circulatory, and gastrointestinal systems, expression of this gene at significantly higher
or lower levels may be routinely detected in certain tissues (e.g., cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder.

The tissue distribution in tumors of brain, kidney, colon, and testis origins, 
indicates that polynucleotides and polypeptides corresponding to this gene are useful for 
diagnosis and intervention of these tumors, in addition to other tumors where 
expression has been indicated. Protein, as well as, antibodies directed against the
protein may show utility as a tumor marker and/or immunotherapy targets for the above
listed tumors and tissues. The tissue distribution indicates that polynucleotides and
polypeptides corresponding to this gene are useful for the detection/treatment of
neurodegenerative disease states and behavioral disorders such as Alzheimer's Disease,
Parkinson's Disease, Huntington's Disease, schizophrenia, mania, dementia, paranoia,
obsessive compulsive disorder and panic disorder. In addition, the gene or gene
product may also play a role in the treatment and/or detection of developmental 
disorders associated with the developing embryo, sexually-linked disorders, or 
disorders of the cardiovascular system. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 146

The translation product of this gene shares sequence homology with goliath 
protein which is thought to be important in the regulation of gene expression during 
development. Protein may serve as a transcription factor. One embodiment for this gene 
is the polypeptide fragments comprising the following amino acid sequence: 
TEHIIAVMITELRGKDILSYLEK^
LMIISSAWLIFYHQKJRYTNAI^DRNQR
PDFTDHCAVCIESYKQNDVVRILPCKHWHKSCVDPWLSEHCTCPMCKLNILKA
LGIV (SEQ ID NO:695); TEHIIAVMITELRGKDILSYLEKNISVQMTIAVGTRMP
PKNFSRGSLVFVSISFIVLMIISSAWLIFYF (SEQ ID NO:697); SISFIVLMIISSA
WLIFYHQKJRYTNARDRNORRLGDAAKKAISKLTTRTVKKGDKE (SEQ ID
NO:698); VKKGDKETDPDFDHCAVCIESYKQNDVVRJLPCKHVFHKSCVDP
WLSEHCTCPMCKLNILKALGIV (SEQ ID NO:699). An additional embodiment is
the polynucleotide fragments encoding these polypeptide fragments (See Accession No.
157535). Moreover, another embodiment is the polynucleotide fragments encoding
these polypeptide fragments:
MTHPGTEHIIAVMITELRGKDILSYLEKNISVQMTIAVGTRMPPKNFSRGS
LVFVSISFIVLMIISSAVVLIFYFIQKIRYTNARDRNQRRLGDAAKKAISKLTTRTV
KKGDKETDPDFDHCAVCIESYKQNDVVRILPCKHVFHKSCVDPWLSEHCTCP
MCKLNILKALGIVPNLPCTDNVAFDMERLTRTQAVNRRSALGDLAGDNSLGLE
PLRTSGISPLPQDGELTPRTGEINIAVTKEWFIIASFGLLSALTLCYMIIRATASLN
ANEVEWF (SEQ ID NO:696); MTHPGTEHIIAVMITELRGKDILSYLEKNISVQM
TIAVGTRMPPKNFSRGSLVFVSISFIVLMIISSAWLIFYFIQKJRYTNARDRNQRR
LGDAAKKAISKLTTRT (SEQ ID NO:700); AAKKAISKLTTRTVKKGDKE
TDPDFDHCAVCIESYKQNDVVRILPCKHVFHKSCVDPWLSEHCTCPMCKLNIL
KALGIVPNLPC (SEQ ID NO:701); TQAVNRRSALGDLAGDNSLGLEPLRTSGI
SPLPQDGELTPRTGEINIAVTKEWFIIASFGLLSALTLCYMIIRATASLNANEVEW
F (SEQ ID NO:702); PLHGVADHLGCDPQTRFFVPPNIKQWIALLQRGNCTF
KEKISRAAFHNAVAVVIYNNKSKEEPVTMTHPGTEHIIAVMITELRGKDILSYLE
KNISVQMTIAVGTRMPPKNFSRGSLVFVSISFIVLMIISSAWLIFYFIQKIRYTNA
RDRNQRPXGDAAKKAISKLTTRTVKKGDKETDPDFDHCAVCIESYKQNDVVRI
LPCKHVFHKSCVDPWLSEHCTCPMCKLNILKALGIVPNLPCTDNVAFDMERLT
RTQAVNRRSALGDLAGDNSLGLEPLRTSGISPLPQDGELTPRTGEINIAVTKEW
FIIASFGLLSALTLCYMIIRATASLNANEVEWF (SEQ ID NO:703); and
HGVADHLGCDPQTRFFVPPNIKQWIALLQRGNCTFKEKISRAAFHNAVAVVIY
NNKSKEE (SEQ ID NO:704). An additional embodiment is the polynucleotide
fragments encoding these polypeptide fragments. When tested against Jurkat cell lines,
supernatants removed from cells containing this gene activated the GAS pathway.
Thus, it is likely that this gene activates immune cells through the JAKS/STAT signal
transduction pathway.

This gene is expressed primarily in macrophage, breast, kidney and to a lesser
extent in synovium, hypothalamus and rhabdomyosarcoma.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, schizophrenia and cancer. Similarly, polypeptides and antibodies directed
to these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune and neural system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution and homology to zinc finger protein indicates that 
polynucleotides and polypeptides corresponding to this gene are useful for the treatment 
of schizophrenia, kidney disease and other cancers. The tissue distribution in 
macrophage, breast, and kidney origins indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and intervention of tumors within
these tissues, in addition to other tumors where expression has been indicated. Protein,
as well as, antibodies directed against the protein may show utility as a tumor marker
and/or immunotherapy targets for the above listed tumors and tissues. Because the gene
is expressed in cells of lymphoid origin, the natural gene product may be involved in
immune functions. Therefore it may be also used as an agent for immunological
disorders including arthritis, asthma, immune deficiency diseases such as AIDS, and
leukemia. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 147 

The translation product of this gene shares sequence homology with HNP36
protein, an equilibrative nucleoside transporter, which is thought to be important in
gene transcription as well as serving as an important component of the nucleoside 
transport apparatus (See Accession No. 1845345). One embodiment for this gene is 
the polypeptide fragments comprising the following amino acid sequence: 

MSGC^LAGrTASVAMICAIASGSELSESATO
YYQQLKLEGPGEQETKLDLISKGEEPRAGKEESGVSVSNSQPTNESHSIKAILK
NISVLAFSVCFIFTITIGMFPAVTVEVKSSIAGSSWERYFIPVSCFLTFNIFDWLG
RSLTAVFMWPGKDSRWLPSWXLARLVFVPLLLLCNIKPRRYLTVVFEHDAWH
rTMAAFAFSNGYLASLCMCFGPKKVKPAEAETAEPSWPSSCVWVWHWGLFS
PSCSGQLCDKGWTEGLPASLPVCLLPLPSARGDPEWSGGFFF (SEQ ID
NO:705); MSGQGLAGFFASVAMICAIASGSELSESAFGYFITACAVirLTnC
YLGLPRLEFYRYYQQLKLEGPGEQETKLDLISKGEEPRAGKEESGVSVSNSQ
PTNESHSI (SEQ ID NO:706); SGVSVSNSQPTNESHSIKAILKNISVLAFSVCFI
FTITIGMFPAVTVEVKSSIAGSSTWERYFIPVSCFLTFNIFDWLGRS (SEQ ID
NO:707); TIGMFPAVTVEVKSSIAGSSTWERYFIPVSCFLTFNIFDWLGRSLTAVF
MWPGKDSRWLPS^LARLVFVPLLLLCNIKPRRYLTVVFEHDA (SEQ ID
NO:708); FGPKKVKPAEAETAEPSWPSSCVWVWHWGLFSPSCSGQLCDK
GWTEGLPASLPVCLLPLPSARGDPEWSGGFFF (SEQ ID NO:709). An additional
embodiment is the polynucleotide fragments encoding these polypeptide fragments.

This gene is expressed primarily in eosinophils and aortic endothelium and to a 
lesser extent in umbilical vein endothelial cell and thymus. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, hematopoietic disease. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the circulatory system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution and homology to HNP36 protein indicates that 
polynucleotides and polypeptides corresponding to this gene are useful for the treatment 
of blood neoplasias and other hematopoietic disease. 


FEATURES OF PROTEIN ENCODED BY GENE NO: 148 

This gene is expressed primarily in breast cancer cell lines, thymus stromal 
cells, and ovary. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, endocrine and female reproductive system diseases including breast
cancer. Similarly, polypeptides and antibodies directed to these polypeptides are useful
in providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
endocrine system, expression of this gene at significantly higher or lower levels may be
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis and treatment of endocrine
disorders. In addition, the tissue distribution in tumors of thymus, ovary, and breast
origins indicates that polynucleotides and polypeptides corresponding to this gene are
useful for diagnosis and intervention of these tumors, in addition to other tumors where
expression has been indicated. Protein, as well as, antibodies directed against the
protein may show utility as a tumor marker and/or immunotherapy targets for the above
listed tumors and tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 149

Translation product of this gene has homology to pmt1 and pmt2, two
conserved Schizosaccharomyces pombe genes. One embodiment for this gene is the
polypeptide fragments comprising the following amino acid sequence:

DDDGFEIVPIEDPAKHRILDPEGLALGAVIASSKKAKRDLIDNSFNRYTFNEDEG
ELPEWFVQEEKQHRIRQLPVGKKEVEHYRKRWREINARPIXXXXXXXXXXX
XXXXXXLEQTRKKAEAVVNTVDIXRTRES (SEQ ID NO:710);
DDDGFEIVPIEDPAKHRILDPEGLALGAVIASSKKAKRDLIDNSFNRYTF (SEQ
ID NO:711); KRWREINARPIXXXXXXXXXXXXXXXXXLEQTRKKAE
AVVNTVDIXRTRES (SEQ ID NO:712). An additional embodiment is the
polynucleotide fragments encoding these polypeptide fragments (See Accession No.
el216734).

This gene is expressed primarily in retina and ovary and to a lesser extent in
breast cancer cells, epididymis and osteosarcoma.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neuronal growth disorders, cancer and reproductive system disorders. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
neural and reproductive system, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 382 as residues: Met-1 to Gly-7.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis or treatment of reproductive 
system disease and cancers. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 150

One embodiment for this gene is the polypeptide fragments comprising the following 
amino acid sequence: 

MIKDKGRARTALTSSQPAHLCPENPLLHLKAAVKEKKRNKKKKTIGSPKRIQS
PLNNKLLNSPAKTLPGACGSPQKLIDGFLKHEGPPAEKPLEELSASTSGVPGLS
SLQSDPAGCVRPPAPNLAGAVEFNDVKTL[...]LIEEKDLEKLDLVIKYMKRLMQQSV[...]
VT (SEQ ID NO:713); MIKDKGRARTALTSSQPAHLCPENPLLHLKAAVKE
KKRNKKKKTIGSPKRIQ (SEQ ID NO:714); KRIQSPLNNKLLNSPAKT
LPGACGSPQKLIDGFLKHEGPPAEKPLEELSASTSGVPGLSSLQSDPAGCVRPP
APNLAGAVEFNDVKTLLREWITTISDPM (SEQ ID NO:715);
TISDPMEEDILQVVKYCTOLEEKDLEKLDLVIKYMKRLMQQSVE
SVWNMAFDFILDNVQVVLQQTYGSTLKVT (SEQ ID NO:716). An additional
embodiment is the polynucleotide fragments encoding these polypeptide fragments.

This gene is expressed primarily in 12 week embryo and to a lesser extent in
hemangiopericytoma and frontal cortex.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, growth disorders and hemangiopericytoma. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the circulatory and neural systems, expression
of this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 383 as residues: Leu-4 to Lys-11.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the treatment of growth disorders,
hemangiopericytoma and other soft tissue tumors.

FEATURES OF PROTEIN ENCODED BY GENE NO: 151

The translation product of this gene has been found to have homology to a 
human DNA mismatch repair protein PMS3. Preferred polypeptide fragments comprise 
the following amino acid sequence: FCHDCKFPEASPAMNCEP (SEQ ID NO:717).
Also preferred are polynucleotide fragments encoding these polypeptide fragments (See 
Accession No. R95250). 



This gene is expressed primarily in neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, lymphoma, immunodeficiency diseases, and cancers resulting from
genetic instability. Similarly, polypeptides and antibodies directed to these polypeptides
are useful in providing immunological probes for differential identification of the
tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the immune system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 384 as residues: Met-1 to Lys-6.

The tissue distribution in neutrophils and the sequence homology indicate that
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis of
Hodgkin's lymphoma, since the elevated expression and secretion by the tumor mass
may be indicative of tumors of this type. Additionally, the gene product may be used as
a target in the immunotherapy of the cancer. Because the gene is expressed in cells of
lymphoid origin, the natural gene product may be involved in immune functions.
Therefore, it may also be used as an agent for immunological disorders including
arthritis, asthma, immune deficiency diseases such as AIDS, and leukemia.
Furthermore, its homology to a known DNA repair protein suggests that the gene may be
useful in establishing cancer predisposition and prevention in gene therapy applications.



FEATURES OF PROTEIN ENCODED BY GENE NO: 152 

This gene is expressed primarily in neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, infectious diseases and lymphoma. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the immune system, expression of this gene
at significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the treatment of inflammation and infectious 
diseases. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 153

One embodiment for this gene is the polypeptide fragments comprising the 
following amino acid sequence: 

MASSVPAGGHTRAGGIFLIGKLDLEASLFKSFQWLPFVLRKKC
NFFCWDSSAHSLPLHPLSASCSAPACHASDTHLLYPSTRALCPSIFAWLVAPHS
VFRTNAPGPTPSSQSSPVFPVFPVSFMALIVCXLVCC (SEQ ID NO:720);
MASSVPAGGHTRAGGIFLIGKLDLEASLFKSFQWLPFVLRKKCNFFCWDSSAH
SLPLHPLSASCSAPACHA (SEQ ID NO:721); FAWLVAPHSVFRTNAPGPTPS
SQSSPVFPVFPVSFMALIVCXLVCC (SEQ ID NO:722). An additional embodiment
is the polynucleotide fragments encoding these polypeptide fragments. 
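
As an illustrative consistency check only, the relationship between the full-length sequence (SEQ ID NO:720) and its two fragments (SEQ ID NO:721 and SEQ ID NO:722) can be sketched in Python. The helper name `locate_fragment` is hypothetical and not part of the disclosure; the sequences are copied from the text above with line breaks removed.

```python
def locate_fragment(full_sequence, fragment):
    """Return the 1-based inclusive (start, end) of fragment within
    full_sequence, or None if the fragment does not occur."""
    index = full_sequence.find(fragment)
    if index < 0:
        return None
    return index + 1, index + len(fragment)


# Sequences as recited above for Gene No. 153 (line breaks removed).
SEQ_720 = (
    "MASSVPAGGHTRAGGIFLIGKLDLEASLFKSFQWLPFVLRKKC"
    "NFFCWDSSAHSLPLHPLSASCSAPACHASDTHLLYPSTRALCPSIFAWLVAPHS"
    "VFRTNAPGPTPSSQSSPVFPVFPVSFMALIVCXLVCC"
)
SEQ_721 = (
    "MASSVPAGGHTRAGGIFLIGKLDLEASLFKSFQWLPFVLRKKCNFFCWDSSAH"
    "SLPLHPLSASCSAPACHA"
)
SEQ_722 = (
    "FAWLVAPHSVFRTNAPGPTPS"
    "SQSSPVFPVFPVSFMALIVCXLVCC"
)

if __name__ == "__main__":
    for name, fragment in (("SEQ ID NO:721", SEQ_721), ("SEQ ID NO:722", SEQ_722)):
        print(name, locate_fragment(SEQ_720, fragment))
```

Under these assumptions, SEQ ID NO:721 corresponds to the amino-terminal portion and SEQ ID NO:722 to the carboxy-terminal portion of SEQ ID NO:720.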

This gene is expressed primarily in neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, inflammation and infectious disease. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune system, expression of this gene
at significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 386 as residues:
Ser-11 to Pro-17.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the treatment of infectious diseases and
inflammation.



FEATURES OF PROTEIN ENCODED BY GENE NO: 154 

This gene is expressed in multiple tissues including ovary, uterus, adipose 
tissue, brain, and the liver. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, uterine, ovarian, brain, and liver cancer. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the female reproductive system, expression 
of this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, 
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an 

individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 
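
Purely as an illustration of comparing a sample against the standard gene expression level, a minimal Python sketch follows. The function name `classify_expression` and the two-fold cut-off are assumptions introduced here; the text does not state how "significantly higher or lower" is quantified, and a fold-change rule is only a stand-in for a proper statistical test.

```python
import math


def classify_expression(sample_level, standard_level, fold_threshold=2.0):
    """Classify a sample's expression relative to the standard level.

    fold_threshold is an assumed cut-off (default two-fold), not a value
    taken from the specification.
    """
    if sample_level <= 0 or standard_level <= 0:
        raise ValueError("expression levels must be positive")
    if fold_threshold <= 1:
        raise ValueError("fold_threshold must be greater than 1")
    log2_ratio = math.log2(sample_level / standard_level)
    cutoff = math.log2(fold_threshold)
    if log2_ratio >= cutoff:
        return "higher"
    if log2_ratio <= -cutoff:
        return "lower"
    return "unchanged"


if __name__ == "__main__":
    # Hypothetical measurements in arbitrary units.
    print(classify_expression(8.0, 2.0))
    print(classify_expression(1.0, 4.0))
    print(classify_expression(3.0, 2.0))
```

Working on the log2 scale treats an increase and a decrease of the same fold symmetrically, which is why the same cut-off is applied in both directions.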

The tissue distribution of this gene indicates that polynucleotides and 
polypeptides corresponding to this gene are useful for diagnostic or therapeutic uses in 
the treatment of the female reproductive system, obesity, and liver disorders,
particularly cancer in the above tissues. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 155 

This gene maps to chromosome 3, and therefore, may be used as a marker in 
linkage analysis for chromosome 3 (See Accession No. D87452).

This gene is expressed in multiple tissues including brain, aortic endothelial 
cells, smooth muscle, pituitary, testis, melanocytes, spleen, neutrophils, and placenta.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immunological disorders including immunodeficiencies, cancers of the
brain and the female reproductive system, as well as cardiovascular disorders, such as
atherosclerosis and stroke. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells,
particularly of the central nervous and immune systems, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution suggests that polynucleotides and polypeptides

corresponding to this gene are useful in treatment/detection of disorders in the nervous 
system, including schizophrenia, neurodegeneration, neoplasia, brain cancer as well as 
cardiovascular and female reproductive disorders including cancer within the above 
tissues. 


FEATURES OF PROTEIN ENCODED BY GENE NO: 156 

The translation product of this gene shares sequence homology with the human 
gene encoding cytochrome b561 (See Accession No. P10897). Cytochrome b561 is a
transmembrane electron transport protein that is specific to a subset of secretory vesicles
containing catecholamines and amidated peptides. This protein is thought to supply
reducing equivalents to the intravesicular enzymes dopamine-beta-hydroxylase and
alpha-peptide amidase. Preferred polypeptides of the invention comprise the amino acid
sequence: 

MAMEGYWRFLALLGSALLVGFLSVIFALW
VLMVTGFVFIQGIAIIVYRLPWTWKCSKLLMKSIHAGLNAVAAILADSVVAVFE
NHNVNNIANMYSLHSWVGLIAVICYLLQLLSGFSVFLLPWAPLSLRAFLMPIHV
YSGIVIFGTVIATALMGLTEKLIFSLRDPAYSTFPPEGVFVNTLGLLILVFGALIF
WIVTRPQWKRPKEPNSTILHPNGGTEQGARGSMPAYSGNNMDKSDSEL
NSEVAARKRNLALDEAGQRSTM (SEQ ID NO:724); as well as antigenic fragments
of at least 20 amino acids of this gene and/or biologically active fragments. Also
preferred are polynucleotide fragments encoding these polypeptide fragments.

This gene is expressed primarily in anergic T-cells.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a 

biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune system and metabolism related diseases. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the immune system, expression of 
this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, 
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution indicates that the protein product or RNA of this gene is 
useful for treatment or diagnosis of immune system and metabolic diseases or
conditions including Tay-Sachs disease, phenylketonuria, galactosemia, various 
porphyrias, and Hurler's syndrome. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 157 

The translation product of this gene shares sequence homology with collagen,
which is important in mammalian development. This gene also shows sequence
homology with bcl-2. (See Accession No. P80988.) Preferred polypeptide fragments
comprise the amino acid sequence: PGRAGPSPGLSLQLPAEPGHPAGNLAPL
TSRPQPLCRIPAVPG (SEQ ID NO:725). Also preferred are polynucleotide
sequences encoding this polypeptide fragment.

This gene is expressed primarily in HL-60 tissue culture cells and to a lesser 
extent in liver, breast, and uterus. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 

biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immunological diseases, hereditary disorders involving the MHC class 
of immune molecules, as well as developmental disorders and reproductive disorders. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 

type(s). For a number of disorders of the above tissues or cells, particularly of the
immune and reproductive systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 390 as residues: Ser-39 to Gly-46, Leu-49
to Ala-62.

The tissue distribution and homology to collagen indicates that polynucleotides 
and polypeptides corresponding to this gene are useful for diagnosis and treatment of 
hereditary MHC disorders and particularly autoimmune disorders including rheumatoid
arthritis, lupus, scleroderma, and dermatomyositis, as well as many reproductive
disorders, including cancers of the uterus and breast tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 158 

This gene is expressed primarily in the amygdala region of the brain.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, a variety of brain disorders, particularly those affecting mood and
personality. Similarly, polypeptides and antibodies directed to these polypeptides are
useful in providing immunological probes for differential identification of the tissue(s)
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the brain and central nervous system, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 

corresponding to this gene are useful for treatment and/or diagnosis of a variety of brain
disorders, particularly bipolar disorder, unipolar depression, and dementia. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 159 

This gene is expressed in a variety of tissues and cell types including brain, 
smooth muscle, kidney, salivary gland and T-cells.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancers of a variety of organs including brain, smooth muscle, kidney,
salivary gland and T-cells and cardiovascular disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the central nervous, urinary, salivary,
digestive, and immune systems, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution in brain, smooth muscle, and T-cells indicates that 
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis of 

various neurological and cardiovascular disorders, including, but not limited to, cancer within the
above tissues. Additionally, the gene product may be used as a target in the
immunotherapy of the cancer. Because the gene is expressed in cells of lymphoid
origin, the natural gene product may be involved in immune functions. Therefore, it may
also be used as an agent for immunological disorders including arthritis, asthma,
immune deficiency diseases such as AIDS, and leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 160 

The translation product of this gene shares sequence homology with collagen,
which is thought to be important in cellular interactions and extracellular matrix formation,
and has been found to be an identifying determinant in autoimmune disorders.
Moreover, this gene shows sequence homology with the yeast protein Sls1p, an
endoplasmic reticulum component involved in the protein translocation process in
the yeast Yarrowia lipolytica. (See Accession No. 1052828; see also J. Biol. Chem. 271,
11668-11675 (1996).) With mouse, this same region shows sequence homology with
the heavy chain of kinesin. (See Accession No. 2062607.) Recently, suppression of the
heavy chain of kinesin was shown to inhibit insulin secretion from primary cultures of
mouse beta-cells. (See Endocrinology 138 (5), 1979-1987 (1997).) Moreover, kinesin
was found associated with drug resistance and cell immortalization. (See 468355.)
Thus, it is likely that this gene also acts as a genetic suppressor element.

This gene is expressed primarily in the greater omentum and to a lesser extent in
a variety of organs and cell types including gall bladder, stromal bone marrow cells,
lymph node, liver, testes, pituitary, and thymus.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 

biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, disorders of the endocrine, gastrointestinal, and immunological systems,
including autoimmune disorders and cancers in a variety of organs and cell types.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
immune and gastrointestinal systems, expression of this gene at significantly higher or
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 393 as residues: Asn-27 to Leu-47, Gln-81
to Lys-88, Asp-93 to Lys-102, Asn-107 to Leu-116, Met-129 to Glu-141, Glu-150
to Asp-157, Lys-176 to Glu-185, Glu-333 to Tyr-349, Cys-393 to Leu-403, Gln-423
to Gly-429. 
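
The residue ranges recited for preferred epitopes follow a regular "Xaa-n to Yaa-m" pattern, so they can be read mechanically. The Python sketch below is illustrative only: the helper names are hypothetical, and the short demonstration sequence is invented, because the sequence of SEQ ID NO: 393 is not reproduced in this passage.

```python
import re

# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

RANGE_RE = re.compile(r"([A-Z][a-z]{2})-(\d+) to ([A-Z][a-z]{2})-(\d+)")


def parse_ranges(text):
    """Parse ranges such as 'Asn-27 to Leu-47' into
    (start_residue, start, end_residue, end) tuples (1-based, inclusive)."""
    return [(a, int(i), b, int(j)) for a, i, b, j in RANGE_RE.findall(text)]


def extract_epitope(sequence, start_res, start, end_res, end):
    """Return the sub-sequence for one range, checking that the named
    residues really occur at the stated positions."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("range lies outside the sequence")
    if sequence[start - 1] != THREE_TO_ONE[start_res]:
        raise ValueError("start residue does not match the sequence")
    if sequence[end - 1] != THREE_TO_ONE[end_res]:
        raise ValueError("end residue does not match the sequence")
    return sequence[start - 1:end]


if __name__ == "__main__":
    print(parse_ranges("Asn-27 to Leu-47, Gln-81 to Lys-88"))
    toy_sequence = "MKTAYIAGQR"  # invented sequence, for demonstration only
    print(extract_epitope(toy_sequence, "Met", 1, "Gly", 8))
```

Checking the named residues against the sequence guards against off-by-one errors when the numbering in a sequence listing and the recited ranges are transcribed separately.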

The tissue distribution within various endocrine and immunological tissues
combined with the sequence homology to a conserved collagen motif indicates that
polynucleotides and polypeptides corresponding to this gene are useful for the
diagnosis of various autoimmune disorders including, but not limited to, rheumatoid
arthritis, lupus erythematosus, scleroderma, and dermatomyositis. Because the gene is
expressed in cells of lymphoid origin, the natural gene product may be involved in
immune functions. Therefore, it may also be used as an agent for immunological
disorders including arthritis, asthma, immune deficiency diseases such as AIDS, and
leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 161 

This gene has homology to the tissue inhibitor of metalloproteinase 2. Such
inhibitors are vital to proper regulation of metalloproteins such as collagenases (See
Accession No. P16368). In addition, this gene maps to chromosome 17, and
therefore, may be used as a marker in linkage analysis for chromosome 17 (See
Accession No. P16368).

This gene is expressed primarily in several types of cancer including
osteoclastoma, chondrosarcoma, and rhabdomyosarcoma and to a lesser extent in
several non-malignant tissues including synovium, amygdala, testes, and placenta.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 

biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, various types of cancer, particularly cancers of bone and cartilage, as
well as various autoimmune disorders. Similarly, polypeptides and antibodies directed
to these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the musculoskeletal system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution in various cancers and the sequence homology to a 
collagenase inhibitor indicates that polynucleotides and polypeptides corresponding to
this gene are useful for detection of various autoimmune disorders such as rheumatoid 
arthritis, lupus, scleroderma, and dermatomyositis. Therefore it may be also used as an 
agent for immunological disorders including arthritis, asthma, immune deficiency 
diseases such as AIDS, and leukemia. 


FEATURES OF PROTEIN ENCODED BY GENE NO: 162 

This gene is homologous to the mitochondrial ATP6 gene and therefore is likely 
a homolog of this gene family (See Accession No. X76197). 
This gene is expressed primarily in brain tissue. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, a variety of brain disorders, including Down's syndrome, depression,
schizophrenia, and epilepsy. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the central nervous system, expression of this gene at significantly higher 
or lower levels may be routinely detected in certain tissues (e.g., cancerous and 
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal 

fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue 
or bodily fluid from an individual not having the disorder. 

The tissue distribution in brain tissue indicates this gene is useful for diagnosis 
of various neurological disorders including, but not limited to, brain cancer. 

Additionally, the gene product may be used as a target in the immunotherapy of cancer in
the brain as well as for the diagnosis of metabolic disorders such as obesity, Tay-Sachs
disease, phenylketonuria and Hurler's Syndrome.

FEATURES OF PROTEIN ENCODED BY GENE NO: 163

This gene is expressed primarily in placenta, neutrophils, and microvascular 
endothelial cells and to a lesser extent in multiple tissues including brain, prostate, 
spleen, thymus, and bone.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, neutropenia and other diseases of the immune system. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune system, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. 

The tissue distribution in placenta indicates that polynucleotides and 
polypeptides corresponding to this gene are useful for diagnosis of various female
reproductive disorders. Additionally, the gene product may be used as a target in the
immunotherapy of various cancers. Because the gene is expressed in some cells of 
lymphoid and endocrine origin, the natural gene product may be involved in immune 
functions and metabolism regulation, respectively. Therefore it may be also used as an 
agent for immunological disorders including arthritis, asthma, immune deficiency
diseases such as AIDS, and leukemia. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 164 

This gene is expressed primarily in neutrophils, monocytes, bone marrow, and 
fetal liver.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, immune system disorders including, but not limited to, autoimmune
disorders such as lupus, and immunodeficiency disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the immune system, expression of this gene
at significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. 
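
The comparison described above, between a sample's expression level and the standard level in healthy tissue or bodily fluid, can be illustrated with a minimal sketch. The text does not define what counts as "significantly" higher or lower, so the two-fold cutoff used here (`fold_threshold`), the function name, and the numeric inputs are all hypothetical assumptions chosen only for illustration.

```python
import math


def classify_expression(sample_level, standard_level, fold_threshold=2.0):
    """Label a sample as 'higher', 'lower', or 'unchanged' relative to the standard.

    fold_threshold is an assumed cutoff; the source text does not specify one.
    """
    if sample_level <= 0 or standard_level <= 0:
        raise ValueError("expression levels must be positive")
    # Work in log2 space so that up- and down-regulation are symmetric.
    log2_fold_change = math.log2(sample_level / standard_level)
    cutoff = math.log2(fold_threshold)
    if log2_fold_change >= cutoff:
        return "higher"
    if log2_fold_change <= -cutoff:
        return "lower"
    return "unchanged"


# Hypothetical measurements in arbitrary units.
print(classify_expression(8.0, 2.0))
print(classify_expression(1.0, 4.0))
```

In practice, a threshold of this kind would be replaced by whatever statistical criterion the chosen assay supports.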

The tissue distribution in various immune system tissues indicates that 
polynucleotides and polypeptides corresponding to this gene are useful for the 
diagnosis of various immunological disorders such as Hodgkin's lymphoma, arthritis, 
asthma, immune deficiency diseases such as AIDS, and leukemia. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 165 

The translation product of this gene shares sequence homology with dystrophin 
which is thought to be defective in both Duchenne and Becker Muscular Dystrophy. 

Preferred polypeptide fragments comprise the following amino acid sequence: 

MKLLGECSSSIDSVKRLEHKLKEEEESLPGFVNLHSTETQTAGVIDRWELLQAQ 
ALSKELRMKQNLQKWQQFNSDLNSIWAWLGDTEEELEQLQRLELSTDIQTIELQ 
IKKLKELQKAVDHRKAIILSINLCSPEFTQADSKESRDLQDRLXQMNGRWDRV 
CSLLEEWRGLLQDALMQCQGFHEMSHGLLLMLENIDRRKNEIVPIDSNLDAEIL 
QDHHKQLMQIKHELLESQLRVASLQDMSCQLLVNAEGTDCLEAKEKVHVIGNR 
LKLLLKEVSRHIKELEKLLDVSSSQQDLSSWSSADELDTSGSVSPXSGRSTPNR 
QKTPRGKCSLSQPGPSVSSPHSRSTKGGSDSSLSEPXPGRSGRGFLFRVLRAA 
LPLQLLLLLLIGLACLVPMSEEDYSCALSNNFARSFHPMLRYTNGPPPL (SEQ ID 
NO:726);MKLLGECSSSIDSVKRLEHKLKEEEESLPGFVNLHSTETQTAGVIDR 

WELLQAQALSKELRMKQNLQKWQQFNSDLNSIWAWLGDTEEELEQLQRLELS 
TDIQTIELQIK (SEQ ID NO:727); KLKELQKAVDHRKAIILSLNLCSPEFTQADSK 
ESRDLQDRLXQMNGRWDRVCSLLEEWRGLLQDALMQCQGFHEMSHGLLLML 
ENIDRRKNEIVPIDSNLDAEILQDHHKQLMQIKHELLESQLRVASLQDMSCQL 
(SEQ ID NO:728); QDMSCQLLVNAEGTDCLEAKEKVHVIGNRLKLLLKEVS 

RHIKELEKLLDVSSSQQDLSSWSSADELDTSGSVSPXSGRSTPNRQKTPRGKCS 
LSQPGPSVSSPHS (SEQ ID NO:729); DSSLSEPXPGRSGRGFLFRVLRAAL 
PLQLLLLLLIGLACLVPMSEEDYSCALSNNFARSFHPMLRYTNGPPPL (SEQ ID 
NO:730). Also preferred are polynucleotide fragments encoding these polypeptide 
fragments. Furthermore, this gene maps to chromosome 6 and, therefore, may be used 
as a marker in linkage analysis for chromosome 6 (See Accession No. N62896). 

This gene is expressed in numerous tissues including the heart, kidney, and 
brain. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, musculoskeletal disorders including Muscular Dystrophy and 
cardiovascular diseases. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the muscle tissues, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution and homology to dystrophin indicate that 
polynucleotides and polypeptides corresponding to this gene are useful for the 
diagnosis and treatment of Muscular Dystrophy and other muscle disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 166 

This gene is expressed primarily in human cerebellum. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, diseases of the central nervous system, including Alzheimer's Disease, 
Parkinson's Disease, ALS, and mental illnesses. Similarly, polypeptides and antibodies 
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or cell type(s). For a number of disorders of 
the above tissues or cells, particularly of the central nervous system, expression of this 
gene at significantly higher or lower levels may be routinely detected in certain tissues 
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, 
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual 
having such a disorder, relative to the standard gene expression level, i.e., the 
expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 399 as residues: Pro-20 to Gly-26, Leu-37 to Pro-42, His-57 to Gly-63. 
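
The residue ranges listed for preferred epitopes follow a regular "Xaa-N to Yaa-M" pattern, so they can be read mechanically. The following is a minimal sketch of one way to parse such a listing and cut the corresponding fragment out of a sequence; the function names are hypothetical, and the placeholder sequence in the usage lines is illustrative only and is not SEQ ID NO: 399.

```python
import re

# Matches entries such as "Pro-20 to Gly-26"; tolerates a stray space after the hyphen.
RANGE_RE = re.compile(r"([A-Z][a-z]{2})-\s*(\d+)\s+to\s+([A-Z][a-z]{2})-\s*(\d+)")


def parse_epitope_ranges(text):
    """Return (start_residue, start_pos, end_residue, end_pos) tuples, 1-based positions."""
    return [(a, int(i), b, int(j)) for a, i, b, j in RANGE_RE.findall(text)]


def slice_epitope(sequence, start, end):
    """Return the fragment from position start to end, both 1-based and inclusive."""
    return sequence[start - 1:end]


ranges = parse_epitope_ranges("Pro-20 to Gly-26, Leu-37 to Pro-42, His-57 to Gly-63")
print(ranges)

# Placeholder sequence for illustration only.
print(slice_epitope("ABCDEFGHIJ", 3, 5))
```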

The tissue distribution indicates that the protein products of this gene are useful 
for treatment/diagnosis of diseases of the central nervous system and may protect or 
enhance survival of neuronal cells by slowing progression of neurodegenerative 
diseases. 



FEATURES OF PROTEIN ENCODED BY GENE NO: 167 

Preferred polypeptides encoded by this gene comprise the following amino acid 
sequence: 

MKLLICGNYLAPSHSESSRRCCLLCFYPLCLEINFGMKVFLSMPFLVLFQ 
SLIQED (SEQ ID NO:731). Polynucleotides encoding such polypeptides are also 
provided. This gene is believed to reside on chromosome 15. Therefore, polynucleotides 
derived from this gene are useful in linkage analysis as chromosome 15 markers. 

This gene is expressed primarily in human testes tumor and to a lesser extent in 
normal human testes. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, diseases of the testes, particularly cancer, and other reproductive 
disorders. Similarly, polypeptides and antibodies directed to these polypeptides are 
useful in providing immunological probes for differential identification of the tissue(s) 
or cell type(s). For a number of disorders of the above tissues or cells, particularly of 
the male reproductive tissues, expression of this gene at significantly higher or lower 
levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

The tissue distribution indicates that the protein products of this gene are useful 
for treatment/diagnosis of testicular diseases including cancers. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 168 

This gene is expressed primarily in fetal liver. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, conditions affecting hematopoietic development and metabolic diseases. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
hepatic system and fetal hematopoietic system, expression of this gene at significantly 
higher or lower levels may be routinely detected in certain tissues (e.g., cancerous and 
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal 
fluid) or another tissue or cell sample taken from an individual having such a disorder, 
relative to the standard gene expression level, i.e., the expression level in healthy tissue 
or bodily fluid from an individual not having the disorder. Preferred epitopes include 
those comprising a sequence shown in SEQ ID NO: 401 as residues: His-7 to Trp-17, 
Leu-19 to Lys-27, Pro-33 to Gly-44, Lys-68 to Gly-74, Lys-85 to Cys-95. 



The tissue distribution indicates that the protein products of this gene are useful 
for treatment/diagnosis of diseases of the developing liver and hematopoietic system, 
and may act as a growth differentiation factor for hematopoietic stem cells. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 169 

The polypeptide encoded by this gene is believed to be a membrane bound 
receptor, the extracellular domain of which is expected to consist of the following 
amino acid sequence: 

RILLVKYSANEENKYDYLPTTVNVCSELVKLVFCVLVSFCVIKKDHQSRNLKY 

ASWKEFSDFMKWSIPAFLYFLDNLIVFYVLSYLQPAMAVIFSNFSirTTALLFRIV 

LKXRLNWIQWASLLTLFLSIVALTAGTKTLQHNLAGRGFHHDAFFSPSNSCLL 

FRNECPRKDNCTAKEWTFPEAKWNTTARVFSHIRLGMGHVLIIVQCFISSMANI 
YNEKILKEGNQLTEXIFIQNSKLYFFGILFNGLTLGLQRSNRDQIKNCGFFYGH 
S (SEQ ID NO:732). Thus, preferred polypeptides encoded by this gene comprise the 
extracellular domain as shown above. It will be recognized, however, that deletions of 
either end of the extracellular domain up to the first cysteine from the N-terminus and 
the first cysteine from the C-terminus are expected to retain the biological functions of the 
full-length extracellular domain because the cysteines are thought to be responsible for 
providing secondary structure to the molecule. Thus, deletions of one or more amino 
acids from either end (or both ends) of the extracellular domain are contemplated. Of 
course, further deletions including the cysteines are also contemplated as useful, as such 
polypeptides are expected to have immunological properties such as the ability to evoke 
an immune response. Polynucleotides encoding all of the foregoing polypeptides are 
provided. 
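
The terminal deletions described above can be enumerated mechanically. The following is a minimal sketch, assuming that "the first cysteine from the C-terminus" means the last cysteine in the written sequence; the function names and the toy sequence are hypothetical and are not taken from SEQ ID NO:732.

```python
def cysteine_bounded_core(domain):
    """Return the stretch from the first cysteine to the last cysteine, inclusive."""
    first = domain.find("C")
    last = domain.rfind("C")
    if first == -1:
        raise ValueError("domain contains no cysteine")
    return domain[first:last + 1]


def terminal_deletions(domain):
    """List every variant that trims 0..k residues from each end without removing
    the first cysteine (N-terminal side) or the last cysteine (C-terminal side)."""
    first = domain.find("C")
    last = domain.rfind("C")
    if first == -1:
        raise ValueError("domain contains no cysteine")
    max_c_trim = len(domain) - 1 - last
    return [
        domain[n_trim:len(domain) - c_trim]
        for n_trim in range(first + 1)
        for c_trim in range(max_c_trim + 1)
    ]


# Toy sequence for illustration only.
toy = "AACGGCAA"
print(cysteine_bounded_core(toy))
print(terminal_deletions(toy))
```

Every variant produced this way still contains the cysteine-bounded core; the further deletions that remove the cysteines, which the text also contemplates, fall outside this sketch.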

This gene is expressed primarily in human osteoclastoma and to a lesser extent 
in hippocampus and chondrosarcoma. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, cancers, particularly those of the bone and connective tissues. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the skeletal system, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 402 as residues: Met-1 to Cys-6, Ala-41 to Tyr-49, Lys-76 to Lys-84. 

The tissue distribution indicates that the protein products of this gene are useful 
for diagnosis of cancers of the bone and connective tissues, and may act as growth 
factors for cells involved in bone or connective tissue growth. 


FEATURES OF PROTEIN ENCODED BY GENE NO: 170 

Preferred polypeptides encoded by this gene comprise the following amino 
acid sequence: 

NSVPNLQTLAVLTEAIGPEPAIPRXPREPPVATSTPATPSAGPQPLPTGTV 

LVPGGPAPPCLGEAWALLLPPCRPSLTSCFWSPRPSPWKETGV (SEQ ID 
NO:733). Polynucleotides encoding such polypeptides are also provided herein. 

This gene is expressed primarily in hematopoietic progenitor cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, diseases of the blood including cancer and autoimmune disorders. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
blood/circulatory system, expression of this gene at significantly higher or lower levels 
may be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or 
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another 
tissue or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Preferred epitopes include those 
comprising a sequence shown in SEQ ID NO: 403 as residues: Gln-4 to His-10, Pro-25 
to His-32. 

The tissue distribution indicates that the protein products of this gene are useful 
for diagnosis of diseases involving growth differentiation of hematopoietic cells. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 171 

Preferred polypeptides encoded by this gene comprise the following amino acid 
sequences: ALQLAFYPDAVEEWLEENVHPSLQRLQXLLQDLSEVSAPP (SEQ ID 
NO:734); and/or CHPPALAGTLLRTPEGRAHARGLLLEAGGA (SEQ ID NO:735). 
Polynucleotides encoding such polypeptides are also provided. The protein product of 
this gene shares sequence homology with metallothioneins. Thus, polypeptides encoded 
by this gene are expected to have metallothionein activity; such activities are known in 
the art and described elsewhere herein. 

This gene is expressed primarily in kidney cortex. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, diseases of the kidney including cancer and renal dysfunction. Similarly, 
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the renal system, 
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 404 as residues: Ser-47 to Gln-52. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment/diagnosis of diseases of the kidney 
including kidney failure. 


FEATURES OF PROTEIN ENCODED BY GENE NO: 172 

This gene is expressed primarily in 12-week-old early stage human. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, developmental abnormalities. Similarly, polypeptides and antibodies 
directed to these polypeptides are useful in providing immunological probes for 
differential identification of the tissue(s) or cell type(s). For a number of disorders of 
the above tissues or cells, particularly of the developing embryo, expression of this 
gene at significantly higher or lower levels may be routinely detected in certain tissues 
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, 
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual 
having such a disorder, relative to the standard gene expression level, i.e., the 
expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 405 as residues: Gln-31 to Thr-43, Gly-51 to Ser-58, Pro-65 to Pro-72. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment/diagnosis of developmental 
problems with fetal tissue. The gene may be involved in vital organ development in the 
early stage, especially hematopoiesis, cardiovascular system, and neural development. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 173 

The translation product of this gene shares sequence homology with TGN38, an 
integral membrane protein previously shown to be predominantly localized to the trans- 
Golgi network (TGN) of cells. 

This gene is expressed primarily in developing embryo and to a lesser extent in 
cancer tissues including lymphoma, endometrial, prostate and colon. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, developmental abnormalities and cancer. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the developing fetus, expression of this 
gene at significantly higher or lower levels may be routinely detected in certain tissues 
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, 
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual 
having such a disorder, relative to the standard gene expression level, i.e., the 
expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 406 as residues: His-65 to Ser-72, Pro-82 to Gly-91, Pro-98 to Glu-118, Ser-126 
to Gly-166, Pro-180 to Asp-188, Tyr-209 to Lys-214, Gln-220 to Leu-228. 

The tissue distribution and homology to an integral membrane protein indicate 
that polynucleotides and polypeptides corresponding to this gene are useful for 
diagnosis of cancers and developmental abnormalities where aberrant expression relates 
to an abnormality. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 174 

The translation product of this gene shares sequence homology with a dnaJ heat 
shock protein from E. coli which is allelic to sec63, a gene that affects transit of nascent 
secretory proteins across the endoplasmic reticulum in yeast. 

This gene is expressed primarily in Hodgkin's lymphoma and to a lesser extent 
in testes. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, cancer. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the immune system, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded 
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Preferred epitopes include those 
comprising a sequence shown in SEQ ID NO: 407 as residues: Thr-13 to Trp-21, 
Arg-74 to Asp-81. 

The tissue distribution and homology to dnaJ indicate that polynucleotides and 
polypeptides corresponding to this gene are useful as a diagnostic for cancer including 
Hodgkin's lymphoma. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 175 

This gene is expressed primarily in endothelial cells and to a lesser extent in 
bone marrow stromal cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, diseases involving angiogenic abnormalities including diabetic 
retinopathy, macular degeneration, and other diseases including arteriosclerosis and 
cancer. Similarly, polypeptides and antibodies directed to these polypeptides are useful 
in providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
vascular system, expression of this gene at significantly higher or lower levels may be 
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

The tissue distribution indicates that the protein products of this gene are useful 
for treating diseases where an increase or decrease in angiogenesis is indicated and as a 
factor in the wound healing process. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 176 

The translation product of this gene shares sequence homology with MAT8 
(mouse) which is thought to be important in regulating chloride conductance in cells 
(particularly in the breast) by modulating the response mediated by cAMP and protein 
kinase C to extracellular signals. 

This gene is expressed primarily in amniotic cells and hematopoietic cells 
including macrophages, neutrophils, T cells, TNF induced aortic endothelium and to a 
lesser extent in testes, TNF induced epithelial cells, and smooth muscle. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, inflammatory responses mediated by T cells, macrophages, and/or 
neutrophils particularly those involving TNF, and also cancer. Similarly, polypeptides 
and antibodies directed to these polypeptides are useful in providing immunological 
probes for differential identification of the tissue(s) or cell type(s). For a number of 
disorders of the above tissues or cells, particularly of the immune system, expression of 
this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, 
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an 
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 409 as residues: Thr-19 to Ala-33, Leu-54 to Asp-82, Pro-89 to Ala-97, Pro-100 
to Lys-125, Ser-127 to Phe-135, Gly-164 to Leu-169, Cys-173 to Arg-178. 

The tissue distribution and homology to mat-8 indicate that polynucleotides and 
polypeptides corresponding to this gene are useful for modifying inflammatory 
responses to cytokines such as TNF and thus modifying the duration and/or severity of 
inflammation. Polynucleotides and polypeptides derived from this gene are thought to 
be useful in the diagnosis and treatment of cancer. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 177 

This gene is expressed primarily in endothelial cells. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, vascular restenosis. Similarly, polypeptides and antibodies directed to 
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above 
tissues or cells, particularly of the vascular system, expression of this gene at 
significantly higher or lower levels may be routinely detected in certain tissues (e.g., 
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treating diseases associated with vascular 
response to injury such as vascular restenosis following angioplasty. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 178 

One embodiment of the claimed invention comprises: 
MRPDWKAGAGPGGPPQKPAPSSQRKPPARPSAAAAAIAVAAAEEERRLRQRN 
RLRLEEDKPAVERCLEELVFGDVENDEDALLRRLRGPRVQEHEDSGDSEVENEA 
KGNrTPQKKPVWVDEEDEDEEMVDMMNNRFRKDMM 

RLKEEFQHAMGGVPAWAETTKRKTSSDDESEEDEDDLLQRTGNFISTSTSLPRG 
ILKMKNCQHANAERPTVARISICAVPSRCTDCDGCWD (SEQ ID NO:737); or 

CLEELVFGDVENDEDALLRRLRGPRVQEHEDSGDSEVENEAKGNTPPQKKPV 
WVDEEDEDEEMVDMMNNRFRKDMMKNASESKLSKDNLKKRLKEEFQHAMG 
GVPAWAETTKRKTSSDDESEEDEDDLLQRTGNFISTSTSLPRGILKMKNCQHA 
NAERPTVARISICAVPSRCTDCDGC (SEQ ID NO: 738). LKEKJVRSFEVSPDGS 
FLLINGIAGYLHLLAMKTKELIGSMKINGRVAASTFSSDSKKVYASSGDGEVYV 

WDVNSRKCLNRFVDEGSLYGLSIATSRNGQYVACGSNCGVVNIYNQDSCLQE 
TNPKPIKAIMNTVTGVTSLTFNPTTEILAIASEKMKEAVRLVHLPSCTVFSN 
KNKNISHVHTMDFSPRSGYFALGNEKGKALMYRLHHYSDF (SEQ ID NO:739); 
and/or KINGRVAASTFSSDSKKVYASSGDGEVYVWDVNSRKCLNRFVDEGSL 
YGLSIATSRNGQYVACGSNCGVVNIYNQDSCLQETNPKPIKAIMNLVTGVTSLT 
FNPTTEILAIASEKMKEAVRLVHLPSCTVFSNFPVIKNKNISFIVHTMDFSPRSG 
YFALGNEKGKAL (SEQ ID NO:740). 

This gene is expressed primarily in epididymis and endometrial tumors and to a 
lesser extent in T cell lymphoma and cell lines derived from colon cancer. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, tumors of the reproductive organs including testis and endometrial cells. 
Similarly, polypeptides and antibodies directed to these polypeptides are useful in 
providing immunological probes for differential identification of the tissue(s) or cell 
type(s). For a number of disorders of the above tissues or cells, particularly of the 
reproductive system, expression of this gene at significantly higher or lower levels may 
be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. Preferred epitopes include those comprising a 
sequence shown in SEQ ID NO: 411 as residues: Ser-67 to Lys-72, Val-87 to Leu-93, 
Tyr-128 to Pro-141, Asp-204 to Gly-210. 

The tissue distribution indicates that the protein products of this gene are useful 
for treating tumors of the endometrium or epithelial tumors of the reproductive system. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 179 

Preferred polypeptides encoded by this gene comprise the following amino acid 
sequence: 

MRILQLILLALATGLVGGETRIIKGFECKLHSQPWQAALFEKTRLLCGATLIAPR 
WLLTAAHCLKPRYIVHLGQHNLQKEEGCEQTRTATESFPHPGFNNSLPNKDH 

RNDLMLVKMASPVSITWAVRPLTLSSRCVTAGTSCSFPAGAARPDPSYACLTPC 
DAPTSPSLSTRSVRTPTPATSQTPWCVPACRKGARTPARVTPGALWSVTSLFKA 
LSPGARIRVRSPESLVSTRKSANMWTGSRRR (SEQ ID NO:741); ETRIIKGFEC 
KLHSQPWQAALFEKTRLLCGATLIAPRWLLTAAHCLKPRYIVHLGQHNLQKEE 
GCEQTRTATESFPHPGFNNSLPNKDHRNDIMLVKMASPVSITWAVRPLTLSSR 
CVTAGTSCSFPAGAARPDPSYACLTPCDAPTSPSLSTRSVRTPTPATSQTPWCVP 
ACRKGARTPARVTPGALWSVTSLFKALSPGARIRVRSPESLVSTRKSANMWTG 
SRRR (SEQ ID NO:742); or CKLHSQPWQAALFEKTRLLCGATLIAPRWLLT 
AAHCLKPRYIVHLGQHNLQKEEGCEQTRTATESFPHPGFNS 
(SEQ ID NO:743). The translation product of this gene shares sequence homology 
with neuropsin, a novel serine protease which is thought to be important in modulating 
extracellular signaling pathways in the brain. Owing to the structural similarity to other 
serine proteases, the protein products of this gene are expected to have serine protease 
activity, which may be assayed by methods known in the art and described elsewhere 
herein. 

This gene is expressed primarily in endometrial tumor and to a lesser extent in 
colon cancer, benign hypertrophic prostate, and thymus. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, cancers of the endometrium or colon and benign hypertrophy of the 

15 prostate. Similarly, polypeptides and antibodies directed to these polypeptides are 

useful in providing immunological probes for differential identification of the tissue(s) 
or cell type(s). For a number of disorders of the above tissues or cells, particularly of 
the urogenital or reproductive systems, expression of this gene at significantly higher or 
lower levels may be routinely detected in certain tissues (e.g., cancerous and wounded 

20 tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or 
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Preferred epitopes include those 
comprising a sequence shown in SEQ ID NO: 412 as residues: Gly-12 to Ser-22, Pro-34
to Ser-53.
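
The residue-range notation used for preferred epitopes (for example, "Gly-12 to Ser-22") can be resolved against a polypeptide sequence mechanically. The following is a minimal sketch only: it assumes that positions are 1-based and inclusive, the helper names `parse_ranges` and `extract_epitopes` are hypothetical, and the 12-residue toy sequence is illustrative and is not a sequence of the invention.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Matches ranges written as "Gly-12 to Ser-22".
RANGE_RE = re.compile(r"([A-Z][a-z]{2})-(\d+)\s+to\s+([A-Z][a-z]{2})-(\d+)")


def parse_ranges(text):
    """Return (start_residue, start_pos, end_residue, end_pos) tuples."""
    return [(a, int(i), b, int(j)) for a, i, b, j in RANGE_RE.findall(text)]


def extract_epitopes(sequence, text):
    """Slice each residue range (assumed 1-based, inclusive) out of `sequence`.

    Raises ValueError when a range falls outside the sequence or when the
    named residues do not match the sequence at the stated positions.
    """
    epitopes = []
    for a, i, b, j in parse_ranges(text):
        if not (1 <= i <= j <= len(sequence)):
            raise ValueError(f"range {a}-{i} to {b}-{j} is outside the sequence")
        if (sequence[i - 1] != THREE_TO_ONE.get(a)
                or sequence[j - 1] != THREE_TO_ONE.get(b)):
            raise ValueError(f"residues do not match at {a}-{i} to {b}-{j}")
        epitopes.append(sequence[i - 1:j])
    return epitopes


# Hypothetical toy sequence for illustration only.
toy_sequence = "MGSPAKDELTWY"
print(extract_epitopes(toy_sequence, "Gly-2 to Ala-5, Asp-7 to Thr-10"))
```

Checking the named residues against the sequence is a deliberate choice in this sketch, since it flags off-by-one and transcription errors before a peptide is synthesized or used for antibody production.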

The tissue distribution and homology to serine proteases indicate that
polynucleotides and polypeptides corresponding to this gene are useful for diagnosing
or treating hyperproliferative disorders such as cancer of the endometrium or colon and
hyperplasia of the prostate.


FEATURES OF PROTEIN ENCODED BY GENE NO: 180 

Preferred polypeptides encoded by this gene comprise the following amino acid
sequence: VLQGRYFSPILEMRRLRPEGXXNLPGGSRAQKEPRQDLTLVLWPHC
PHFAMTRSYVPTKQCMVQGSFYCIFIFKGPVQNWC (SEQ ID NO:744).
Polynucleotides encoding such polypeptides are also provided.

This gene is expressed primarily in fetal brain.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, identifying and expanding stem cells in the CNS. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the nervous system, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.
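
The comparison of a sample's expression level with the standard gene expression level can be illustrated with a simple fold-change calculation. This is a sketch under stated assumptions, not the method of the invention: the function name `classify_expression` and the two-fold cutoff are hypothetical, the text does not define what counts as "significantly" higher or lower, and a statistical test over replicate measurements may be more appropriate in practice.

```python
import math


def classify_expression(sample_level, standard_level, fold_threshold=2.0):
    """Compare a sample's expression level with the standard level.

    Returns "higher", "lower", or "unchanged" based on the log2 fold change.
    `fold_threshold` is a hypothetical cutoff chosen for illustration.
    """
    if sample_level <= 0 or standard_level <= 0:
        raise ValueError("expression levels must be positive")
    if fold_threshold <= 1:
        raise ValueError("fold_threshold must be greater than 1")

    log2_fold_change = math.log2(sample_level / standard_level)
    limit = math.log2(fold_threshold)

    if log2_fold_change >= limit:
        return "higher"
    if log2_fold_change <= -limit:
        return "lower"
    return "unchanged"


# Hypothetical measurements in arbitrary units: (sample, standard).
for sample, standard in [(8.0, 2.0), (1.0, 4.0), (3.0, 2.0)]:
    print(sample, standard, classify_expression(sample, standard))
```

Working on a log2 scale treats increases and decreases symmetrically, so a four-fold rise and a four-fold drop are equally far from the standard level.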

The tissue distribution indicates that the protein products of this gene are useful
for detecting and expanding stem cell populations in the (or of the) central nervous
system.

FEATURES OF PROTEIN ENCODED BY GENE NO: 181 

This gene is expressed primarily in early stage human brain and a stromal cell
line.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, developmental abnormalities of the CNS. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the central nervous system, expression of 
this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 414 as residues: Gln-42 to Gln-47, Gln-54 to Pro-60. 

The tissue distribution indicates that the protein products of this gene play a role
in the development of the central nervous system. Therefore, this gene and its products
are useful for diagnosing or treating developmental abnormalities of the central nervous
system.

FEATURES OF PROTEIN ENCODED BY GENE NO: 182 

Preferred polypeptides encoded by this gene comprise the following amino acid
sequence:
MPIIDQVNTELHDFMQSAEVGTIFALSWLITWFGHVLSDFRHVVRLYDF
FLACHPLMPIYFAAVIVLYREQEVLDCDCDMASVHHLLSQIPQDLPYETLISRXE
TFLFSFPHPNLLGRPLPNSKLRGRQPLLSKTLSWHQPSRGLIWCCGSGXRGLL
RPEDRTKDVLTKPRTNRFVKLAVMGLTVALGAAALAVVKSALEWAPKFQLQL
FP (SEQ ID NO:745); or CPEFFIPATLPCPFVFAFTSEASSRAYLTQRGPGGLAQ
NLMPLPVGFWMGSLPPPWCWRKWVSEACSCFC (SEQ ID NO:746). These
polypeptides are structurally similar to various TGF-beta family members. Thus, these
polypeptides are expected to have a variety of activities in the modulation of cell growth
and proliferation.

This gene is expressed primarily in osteoclastoma, microvascular endothelium, 
and bone marrow derived cell lines. 

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, hematological diseases particularly involving aberrant proliferation of 
stem cells. Similarly, polypeptides and antibodies directed to these polypeptides are 
useful in providing immunological probes for differential identification of the tissue(s) 
or cell type(s). For a number of disorders of the above tissues or cells, particularly of
the immune system, expression of this gene at significantly higher or lower levels may
be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily 
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or 
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an
individual not having the disorder. Preferred epitopes include those comprising a
sequence shown in SEQ ID NO: 415 as residues: Ser-33 to Ala-39. 

The tissue distribution indicates that the protein products of this gene are useful
for treating disorders of the progenitors of the immune system. Applications include in
vivo expansion of progenitor cells, ex vivo expansion of progenitor cells, or the
treatment of tumors of the circulatory system, such as lymphomas.

FEATURES OF PROTEIN ENCODED BY GENE NO: 183

This gene maps to chromosome 17 and, therefore, polynucleotides of the
invention can be used in linkage analysis as a marker for chromosome 17. In specific
embodiments, polypeptides of the invention comprise the sequence:
GFGSVSAAGRRSGGTWQPVQ (SEQ ID NO:747); PGGLAVGSRWWSRSLT
(SEQ ID NO:748); LEPSRQRRPRRRGGTSRPETDQRAKCWRQL (SEQ ID
NO:749); and/or VCLRCQNRMEN (SEQ ID NO:750). In further specific
embodiments, polypeptides of the invention comprise the sequence: MAACTARRPGR
GQPLVVPVADXGPVAKAALCAAXAGAFSPASTTTTRRHLSSRNRPEGKVLETV
GVFEVPKQNGKYETGQLFLHSIFGYRGVVLFPWQARLXDRDVASAAPEKAEN
PAGHGSKEVKGKTHTYYQVLIDARDCPHISQRSQTEAVTFLANHDDSRALYAIP
GLDYVSHEDILPYTSTDQVPIQHELFERFLLYDQTKAPPFVARETLRAWQEKNH
PWLELSDVHRETTENIRVTVIPFYMGMREAQNSHVYWRYCIRLENLDSDVVQ
LRERHWRIFSLSGTLETVRGRGVVGREPVLSKEQPAFQYSSHVSLQASSGHMW
GTFRFERPDGSHFDVRIPPFSLESNKDEKTPPSGLHW (SEQ ID NO:751);
MAACTARRPGRGQPLVVPVADXGPVAKAALCAA (SEQ ID NO:752);
VLETVGVFEVPKQNGKYETGQLFLHSIFGYRGVVL (SEQ ID NO:757);
GLDYVSHEDILPYTST (SEQ ID NO:758); DVHRETTENIRVTVIPFYM (SEQ ID
NO:759); WWRYCIRLENLDSDVVQLRER (SEQ ID NO:760); and/or PAFQYSS
HVSLQASSGHMWGTFRFER (SEQ ID NO:761). Polynucleotides encoding these
polypeptides are also encompassed by the invention.

This gene is expressed primarily in gall bladder, prostate, and fetal brain, and to 
a lesser extent in a few tumor and fetal tissues. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, growth related disorders such as cancers. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes 
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the prostate, gall bladder, and fetal brain,
expression of this gene at significantly higher or lower levels may be routinely detected 
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, 
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of growth-related
disorders, such as cancers.

FEATURES OF PROTEIN ENCODED BY GENE NO: 184

In specific embodiments, polypeptides of the invention comprise the 
sequence: SLCCPEGAEGC (SEQ ID NO:762) and/or QLKKTHYDRPCP (SEQ ID
NO:763). Polynucleotides encoding these polypeptides are also encompassed by the 
invention. 

This gene is expressed primarily in stromal cell, tonsil, and glioblastoma and to
a lesser extent in some other tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune and inflammatory disorders and glioblastoma. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s). For 
a number of disorders of the above tissues or cells, particularly of the stromal cells,
tonsil, and glioblastoma, expression of this gene at significantly higher or lower levels
may be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another 
tissue or cell sample taken from an individual having such a disorder, relative to the 
standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Additionally, it is believed that the
product of this gene regulates pancreatic cell differentiation into beta cells. Accordingly,
polynucleotides and polypeptides of the invention are useful in the treatment of
insulin-dependent diabetes mellitus and associated conditions, e.g., pancreatic
hypofunction, and in the prevention, as well as the treatment, of undifferentiated type
pancreatic cancers.
Preferred epitopes include those comprising a sequence shown in SEQ ID NO: 417 as
residues: Pro-27 to Ala-32.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of immune and 
inflammatory disorders and glioblastoma. 






FEATURES OF PROTEIN ENCODED BY GENE NO: 185 

This gene is expressed primarily in hepatocellular carcinoma and to a lesser
extent in other tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, liver diseases. Similarly, polypeptides and antibodies directed to these
polypeptides are useful in providing immunological probes for differential identification
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the liver, expression of this gene at significantly higher or lower levels 
may be routinely detected in certain tissues (e.g., cancerous and wounded tissues) or 
bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another
tissue or cell sample taken from an individual having such a disorder, relative to the
standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 418 as residues: Gly-32 to Lys-39.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of liver diseases.

FEATURES OF PROTEIN ENCODED BY GENE NO: 186 

This gene is expressed primarily in hippocampus and to a lesser extent in other
tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neuronal disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the hippocampus, expression of this gene at significantly
higher or lower levels may be routinely detected in certain tissues (e.g., cancerous and 
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal 
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of neuronal disorders. 






FEATURES OF PROTEIN ENCODED BY GENE NO: 187 

This gene is expressed primarily in bone cancer and hippocampus and to a
lesser extent in osteoclastoma and other tissues.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, bone-related disorders and neuronal diseases. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the bone, osteoclast, and
hippocampus, expression of this gene at significantly higher or lower levels may be 
routinely detected in certain tissues (e.g., cancerous and wounded tissues) or bodily
fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or another tissue or
cell sample taken from an individual having such a disorder, relative to the standard 
gene expression level, i.e., the expression level in healthy tissue or bodily fluid from an 
individual not having the disorder. 

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of bone-related
disorders and neuronal diseases. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 188 

This gene maps to chromosome 4 and, therefore, polynucleotides of the invention
can be used in linkage analysis as a marker for chromosome 4.

This gene is expressed primarily in neuronal tissues such as hippocampus, 
spinal cord, and hypothalamus and to a lesser extent in a few other tissues such as 
ovary. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neuronal diseases. Similarly, polypeptides and antibodies directed to 
these polypeptides are useful in providing immunological probes for differential 
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the neuronal tissues, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial 
fluid or spinal fluid) or another tissue or cell sample taken from an individual having 
such a disorder, relative to the standard gene expression level, i.e., the expression level 
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of neuronal disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 189

This gene maps to chromosome 10 and, therefore, polynucleotides of the invention
can be used in linkage analysis as a marker for chromosome 10.

This gene is expressed primarily in neuronal tissues and immune tissues, and to
a lesser extent in a few other tissues such as skin tumor, lung, etc.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neuronal and immune-related disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the neuronal and immune-related tissues, 
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from 
an individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 422 as residues: Pro-19 to Asp-25.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of neuronal and 
immune-related disorders. 



FEATURES OF PROTEIN ENCODED BY GENE NO: 190

The translation product of this gene shares sequence homology with human 
N33, a gene located in a homozygously deleted region of human metastatic prostate 
cancer which is thought to be important in prevention of prostate cancer. In specific 
embodiments, polypeptides of the invention comprise the sequence: 

AQRKKEMVLSEKVSQLMEWTNKRP
LQLHRQCVVCKQADEEFQILANSWRY
NSAPTFINFPAKGKPKRGDTYELQVRGFSAEQIARWIADRTDVNIRVIRPPNMA
ARWRFWCVSVT (SEQ ID NO:765); MVVALLIVCDVPSAS (SEQ ID NO:766);
AQRKKEMVLSEKVSQL (SEQ ID NO:767); MEWTNKRPVIRMNGDKF (SEQ
ID NO:768); RRLVKAPPRNYSVIVMFTALQLHRQCVVCKQADEEFQILANSWRY
SSAFTNRIFFA (SEQ ID NO:769); MVDFDEGSDVFQMLNMNSAPTFINFPAK
GKP (SEQ ID NO:770); KRGDTYELQVRGFSAEQIARWIADRTDVNIRVIRPPN
(SEQ ID NO:771); and/or YAGPLMLGLLLAVIGGLVYLRRVIWNFSLIKLDGLLQL
CVLCLL (SEQ ID NO:772). Polynucleotides encoding these polypeptides are also 
encompassed by the invention. 

This gene is expressed primarily in infant adrenal gland and prostate cell line and to
a lesser extent in a few other tissues such as liver, smooth muscle, etc.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, prostate cancer and endocrine disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders 
of the above tissues or cells, particularly of the prostate and adrenal gland, expression 
of this gene at significantly higher or lower levels may be routinely detected in certain 
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e., 
the expression level in healthy tissue or bodily fluid from an individual not having the 
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID 
NO: 423 as residues: Pro-34 to Gly-43, Arg-113 to Pro-120.

The tissue distribution and homology to N33 indicate that polynucleotides and
polypeptides corresponding to this gene are useful for diagnosis and treatment of
prostate cancer and endocrine disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 191 

This gene is expressed primarily in T cells and to a lesser extent in fetal lung.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a 
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, immune disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune system, expression of this gene at significantly
higher or lower levels may be routinely detected in certain tissues (e.g., cancerous and 
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder. Preferred epitopes include
those comprising a sequence shown in SEQ ID NO: 424 as residues: Trp-3 to Phe-9. 

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of immune disorders. 


FEATURES OF PROTEIN ENCODED BY GENE NO: 192 

This gene maps to chromosome 6 and, therefore, polynucleotides of the invention
can be used in linkage analysis as a marker for chromosome 6. Neural activity and
neurotrophins induce synaptic remodeling in part by altering gene expression. This
gene is believed to be a glycosylphosphatidylinositol-anchored protein encoded by a
hippocampal gene and to possess neural activity. This molecule is believed to be
expressed in postmitotic-differentiating neurons of the developing nervous system and
neuronal structures associated with plasticity in the adult. Message of this gene is
believed to be induced by neuronal activity and by the activity-regulated neurotrophins
BDNF and NT-3. The product of this gene is believed to stimulate neurite outgrowth
and arborization in primary embryonic hippocampal and cortical cultures and to act as a
downstream effector of activity-induced neurite outgrowth. In specific embodiments,
polypeptides of the invention comprise the sequence: DAVFKGFSDCLLKLGDS (SEQ
ID NO:773); CQEGAKDMWDKLRKESKNLN (SEQ ID NO:774);
VLLVSLSAALATWLSF (SEQ ID NO:775); MGLKLNGRYISLILAVQIAYLVQAVR
AAGKCDAVFKGFSDCLLKLGDS (SEQ ID NO:776); PAAWDDKTNIKTVCTYW
EDFHSCTVTALTDCQEGAKDMWDKLRKESKNLNIQGSLFELCGSGNGAAGSL
LPAFPVLLVSLSAALATWLSF (SEQ ID NO:777); and/or MGLKLNGRYISLILA
VQIAYLVQAVRAAGKCDAVFKGFSDCLLKLGDSXXXXXPAAWDDKTNIKTVC
TYWEDFHSCTVTALTDCQEGAKDMWDKLRKESKNLNIQGSLFELCGSGNGAA
GSLLPAFPVLLVSLSAALATWLSF (SEQ ID NO:778). Polynucleotides encoding
these polypeptides are also encompassed by the invention.

This gene is expressed primarily in human placenta, endometrial tumor and
tissues of the central nervous system (CNS).

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, reproductive disorders, cancers and neurological diseases.
Similarly, polypeptides and antibodies directed to these polypeptides are useful in
providing immunological probes for differential identification of the tissue(s) or cell
type(s). For a number of disorders of the above tissues or cells, particularly of the
reproductive and nervous systems, expression of this gene at significantly higher
or lower levels may be routinely detected in certain tissues (e.g., cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder. Preferred epitopes include
those comprising a sequence shown in SEQ ID NO: 425 as residues: Asp-47 to Asp-63,
His-75 to Tyr-80, Pro-83 to Tyr-89.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of reproductive
disorders such as endometrial tumors. Expression of this gene in tissues of the CNS
and its strong homology to Neuritin suggest that the protein product from this gene may
also be used in the treatment and diagnosis of neurological disorders and in the 
regeneration of neural tissues, e.g., following injury. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 193

The translation product of this gene shares sequence homology with tenascin,
which is thought to be important in development. The translation product of this gene is
believed to be a ligand of the fibroblast growth factor family. FGF ligand activity is
known in the art and can be assayed by methods known in the art and disclosed
elsewhere herein.

This gene is expressed primarily in endometrial tumors and other types of
tumors.

Therefore, polynucleotides and polypeptides of the invention are useful as 
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancers. Similarly, polypeptides and antibodies directed to these 
polypeptides are useful in providing immunological probes for differential identification 
of the tissue(s) or cell type(s). For a number of disorders of the above tissues or cells, 
particularly of the cancer tissues, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to 
the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. Preferred epitopes include those
comprising a sequence shown in SEQ ID NO: 426 as residues: Gly-29 to Glu-34, Arg-71
to Arg-76, Thr-176 to Cys-182, Gly-184 to Glu-199.

The tissue distribution and homology to tenascin indicate that polynucleotides
and polypeptides corresponding to this gene are useful for diagnosis and treatment of
cancers.

FEATURES OF PROTEIN ENCODED BY GENE NO: 194

In specific embodiments, polypeptides of the invention comprise the sequence: 
MNSAAGFSHLDRRERVLKLGESFEKQPRCASTLC (SEQ ID NO: 779). 
Polynucleotides encoding these polypeptides are also encompassed by the invention.

This gene is expressed primarily in fetal human lung and neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are 
not limited to, lung development and respiratory disorders. Similarly, polypeptides and 
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the respiratory system, expression of this 
gene at significantly higher or lower levels may be routinely detected in certain tissues 
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, 
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual
having such a disorder, relative to the standard gene expression level, i.e., the
expression level in healthy tissue or bodily fluid from an individual not having the
disorder. 

The tissue distribution in fetal lung and neutrophils indicates that
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis
and treatment of lung- and immunity-related diseases, for example, lung cancer, viral,
fungal or bacterial infections (e.g., lesions caused by tuberculosis), inflammation (e.g.,
pneumonia), metabolic lesions, etc.



FEATURES OF PROTEIN ENCODED BY GENE NO: 195

This gene is expressed primarily in breast lymph node.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis and treatment of immune
disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 196

This gene maps to chromosome 5 and, accordingly, polynucleotides of the invention can
be used in linkage analysis as a marker for chromosome 5. The translation product of
this gene shares sequence homology with human M-phase phosphoprotein 4, which is
thought to be important in phosphorylation and signal transduction processes. In
specific embodiments, polypeptides of the invention comprise the sequence:
TIYPTEEELQAVQKIVSITERALKLVSD (SEQ ID NO:780); RALKGVLRV
GVLAKGLLLRGDRNVNLVLLC (SEQ ID NO:781); ALAALRHAKWFQARAN
GLQSCVHIRILRDLCQRVPTWS (SEQ ID NO:782); GDALRRVFECISSGIIL (SEQ
ID NO:783); LAFRQIHKVLGMDPLP (SEQ ID NO:784); and/or T I Y PTEEELQ A VQ 

20 KIVSITERALKLVSDSLSEHEKNKNKEGDDKKEGGKDRALKGVLRVGVLAKG 
LLLRGDRNVNLVLLCSEKPSKTLLSRIAENLPKQLAVISPEKYDIKCAVSEAAIIL 
NSCVEPKMQVTITLTSPnREENMREGDVTSGMVKDPPDVLDRQKCLDALAALR 
HAKWFQARANGLQSCVHIRILRDLCQRVPTWSDFPSWAMELLVEKAISSASSP 
QSPGDALRRVFECISSGIILKGSPGLLDPCEKDPFDTLATMTDQQREDITSSAQFA 

25 LRLLAFRQIHKVLGMDPLPQMSQRFNIHNNRKRRRDSDGVDGFEAEGKKDKK 
DYDNF (SEQ ID NO:785). Polynucleotides encoding these polypeptides are also 
encompassed by the invention. 

This gene is expressed primarily in Human Hippocampus and to a lesser extent
in Prostate and Human Frontal Cortex.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, disorders related to reproductive system and nervous system. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the reproductive
system and nervous system, expression of this gene at significantly higher or lower
levels may be routinely detected in certain tissues (e.g., cancerous and wounded
tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal fluid) or
another tissue or cell sample taken from an individual having such a disorder, relative to
the standard gene expression level, i.e., the expression level in healthy tissue or bodily
fluid from an individual not having the disorder.

The tissue distribution and homology to human M-phase phosphoprotein 4 
indicates that polynucleotides and polypeptides corresponding to this gene are useful for 
the diagnosis and treatment of reproductive and nervous system disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 197

In specific embodiments, polypeptides of the invention comprise the sequence:
MGSQHSAAARPSSCRRKQEDDRDG (SEQ ID NO:786);
LLAEREQEEAIAQFPYVEFTGRDSITCLTC (SEQ ID NO:787); and/or
QGTGYIPTEQVNELVALIPHSDQRLRPQRTKQYV (SEQ ID NO:788).
Polynucleotides encoding these polypeptides are also encompassed by the invention.

This gene is expressed primarily in Human Primary Breast Cancer and to a 
lesser extent in Human Adult Spleen, Hodgkin's Lymphoma I, Salivary Gland. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, cancer and immunal disorders. Similarly, polypeptides and antibodies
directed to these polypeptides are useful in providing immunological probes for
differential identification of the tissue(s) or cell type(s). For a number of disorders of
the above tissues or cells, particularly of the cancer and immune system, expression of
this gene at significantly higher or lower levels may be routinely detected in certain
tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma,
urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from an
individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 430 as residues: Ser-126 to Gly-138.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for the diagnosis and treatment of cancer and 
immunal disorders. 



FEATURES OF PROTEIN ENCODED BY GENE NO: 198

This gene is expressed primarily in monocytes. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, blood cell disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis and treatment of blood cell
disorders.

FEATURES OF PROTEIN ENCODED BY GENE NO: 199 

This gene is expressed primarily in Human Ovary and Synovia and to a lesser
extent in Human 8 Week Whole Embryo.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, reproductive and developmental disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the reproductive and developmental system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for the diagnosis and treatment of reproductive
and developmental disorders.






FEATURES OF PROTEIN ENCODED BY GENE NO: 200 

This gene maps to chromosome 8 and therefore polynucleotides of the invention
can be used in linkage analysis as a marker for chromosome 8. The translation product
of this gene shares limited sequence homology with the collagen proline-rich domain.

This gene is expressed primarily in the CNS.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, neurological diseases. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the nervous system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder. Preferred
epitopes include those comprising a sequence shown in SEQ ID NO: 433 as residues:
Pro-35 to Asp-41.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of neurological 
diseases. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 201

The translation product of this gene shares homology with a mammalian histone
H1a protein. One embodiment for this gene is the polypeptide fragments comprising the
following amino acid sequence: ARLNVGRESLKREMLKSQGVKVSESPMGAR
HSSWPEGAAFCKKVQGAQMQFPPRR (SEQ ID NO:789); ARLNVGRESLKR
EML (SEQ ID NO:790); LKSQGVKVSESPMGARHSSW (SEQ ID NO:791);
AFCKKVQGAQMQFPPRR (SEQ ID NO:792). An additional embodiment is the
polynucleotide fragments encoding these polypeptide fragments (See Accession No.
pir|S24178).

This gene is expressed primarily in neutrophils.

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of immune disorders.
Since the gene is expressed in cells of lymphoid origin, the natural gene product may be
involved in vital immune functions. Therefore it may be also used as an agent for
immunological disorders including arthritis, asthma, immune deficiency diseases such
as AIDS, and leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 202 

This gene is expressed primarily in neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders. Similarly, polypeptides and antibodies directed to
these polypeptides are useful in providing immunological probes for differential
identification of the tissue(s) or cell type(s). For a number of disorders of the above
tissues or cells, particularly of the immune system, expression of this gene at
significantly higher or lower levels may be routinely detected in certain tissues (e.g.,
cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial
fluid or spinal fluid) or another tissue or cell sample taken from an individual having
such a disorder, relative to the standard gene expression level, i.e., the expression level
in healthy tissue or bodily fluid from an individual not having the disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for diagnosis and treatment of immune disorders. 
Since the gene is expressed in cells of lymphoid origin, the natural gene product may be 
involved in immune functions. Therefore it may be also used as an agent for
immunological disorders including arthritis, asthma, immune deficiency diseases such
as AIDS, and leukemia.

FEATURES OF PROTEIN ENCODED BY GENE NO: 203

This gene is expressed primarily in Neutrophils. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, infectious disorders, immune disorders, and cancers. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly of the immune system,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 436 as residues: Thr-31 to Lys-36.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for diagnosis and treatment of infectious
disorders, immune disorders, and cancers. Since the gene is expressed in cells of
lymphoid origin, the natural gene product may be involved in immune functions.
Therefore it may be also used as an agent for immunological disorders including
arthritis, asthma, immune deficiency diseases such as AIDS, and leukemia. The protein,
as well as antibodies directed against the protein, may show utility as a tumor marker
and/or immunotherapy target for the above listed tumors and tissues.

FEATURES OF PROTEIN ENCODED BY GENE NO: 204 

This gene maps to chromosome 16 and therefore polynucleotides of the
invention can be used in linkage analysis as markers for chromosome 16. The
translation product of this gene shares sequence homology with lactate dehydrogenase
which is thought to be important in lactate metabolism.

This gene is expressed primarily in human tonsils and to a lesser extent in 
Spleen, and Neutrophils. 



Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, immune disorders, infectious disorders, and cancers. Similarly,
polypeptides and antibodies directed to these polypeptides are useful in providing
immunological probes for differential identification of the tissue(s) or cell type(s). For
a number of disorders of the above tissues or cells, particularly immune
disorders, infectious disorders, and cancers, expression of this gene at significantly
higher or lower levels may be routinely detected in certain tissues (e.g., cancerous and
wounded tissues) or bodily fluids (e.g., serum, plasma, urine, synovial fluid or spinal
fluid) or another tissue or cell sample taken from an individual having such a disorder,
relative to the standard gene expression level, i.e., the expression level in healthy tissue
or bodily fluid from an individual not having the disorder. Preferred epitopes include
those comprising a sequence shown in SEQ ID NO: 437 as residues: Gly-7 to Ser-12.

The tissue distribution and homology to lactate dehydrogenase gene indicates 
that polynucleotides and polypeptides corresponding to this gene are useful for 
diagnosis and treatment of immune disorders, infectious disorders, and cancers. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 205

The translation product of this gene shares sequence homology with Gcap1
protein which is developmentally regulated in brain.

This gene is expressed primarily in placenta and endometrial tumor and to a 
lesser extent in several other tumors. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, vasculogenesis/angiogenesis and tumorigenesis. Similarly, polypeptides
and antibodies directed to these polypeptides are useful in providing immunological
probes for differential identification of the tissue(s) or cell type(s). For a number of
disorders of the above tissues or cells, particularly of the vascular system and tumors,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution and homology to Gcap1 protein indicates that
polynucleotides and polypeptides corresponding to this gene are useful for diagnosis
and treatment of disorders or dysfunction of the vascular system or of tumorigenesis.

FEATURES OF PROTEIN ENCODED BY GENE NO: 206

In specific embodiments, polypeptides of the invention comprise the sequence:
MPYAQWLAENDRFEEAQKAFHKAGRQREA (SEQ ID NO:799);
VQVLEQLTNNAVAESRFNDAAYYYWMLSMQCLDIAQD (SEQ ID NO:794);
PAQKDTMLGKFYHFQRLAELYHGYHAIHRHTEDP (SEQ ID NO:795);
FSVHRPETLFNISRFLLHSLPKDTPSGISKVKILFT (SEQ ID NO:800);
LAKQSKALGAYRLARHAYDKLRGLYIP (SEQ ID NO:796); ARFQKSIELG
TLTIRAKPFHDSEELVPLCYRCSTNN (SEQ ID NO:797); and/or PLLNNLGNVC
INCRQPFIFSASSYDVLHLVEFYLEEGITDEEAISLIDLEVLRPKRDDRQLEICKQQ
LPDSCG (SEQ ID NO:798). Polynucleotides encoding these polypeptides are also

This gene is expressed primarily in testes. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, male reproductive and endocrine disorders. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the reproductive and endocrine systems,
expression of this gene at significantly higher or lower levels may be routinely detected
in certain tissues (e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum,
plasma, urine, synovial fluid or spinal fluid) or another tissue or cell sample taken from
an individual having such a disorder, relative to the standard gene expression level, i.e.,
the expression level in healthy tissue or bodily fluid from an individual not having the
disorder.

The tissue distribution indicates that polynucleotides and polypeptides 
corresponding to this gene are useful for treatment of male reproductive and endocrine 
disorders. 

FEATURES OF PROTEIN ENCODED BY GENE NO: 207

This gene is expressed in fetal lung. 

Therefore, polynucleotides and polypeptides of the invention are useful as
reagents for differential identification of the tissue(s) or cell type(s) present in a
biological sample and for diagnosis of diseases and conditions, which include, but are
not limited to, lung diseases such as cystic fibrosis. Similarly, polypeptides and
antibodies directed to these polypeptides are useful in providing immunological probes
for differential identification of the tissue(s) or cell type(s). For a number of disorders
of the above tissues or cells, particularly of the respiratory system, expression of this
gene at significantly higher or lower levels may be routinely detected in certain tissues
(e.g., cancerous and wounded tissues) or bodily fluids (e.g., serum, plasma, urine,
synovial fluid or spinal fluid) or another tissue or cell sample taken from an individual
having such a disorder, relative to the standard gene expression level, i.e., the
expression level in healthy tissue or bodily fluid from an individual not having the
disorder. Preferred epitopes include those comprising a sequence shown in SEQ ID
NO: 440 as residues: Tyr-49 to Cys-54.

The tissue distribution indicates that polynucleotides and polypeptides
corresponding to this gene are useful for detection and treatment of disorders associated
with developing lungs, particularly in premature infants where the lungs are the last
tissues to develop. The tissue distribution also indicates that polynucleotides and
polypeptides corresponding to this gene are useful for diagnosis and intervention of
lung tumors since the gene may be involved in the regulation of cell division,
particularly since it is expressed in fetal tissue. The protein, as well as antibodies directed
against the protein, may show utility as a tumor marker and immunotherapy target for
the above listed tumors and tissues.

Table 1 summarizes the information corresponding to each "Gene No." described
above. The nucleotide sequence identified as "NT SEQ ID NO:X" was assembled from
partially homologous ("overlapping") sequences obtained from the "cDNA clone ID"
identified in Table 1 and, in some cases, from additional related DNA clones. The
overlapping sequences were assembled into a single contiguous sequence of high
redundancy (usually three to five overlapping sequences at each nucleotide position),
resulting in a final sequence identified as SEQ ID NO:X.
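
The idea of assembling overlapping sequences into one contiguous sequence of high redundancy can be illustrated with a minimal sketch. It assumes the overlapping sequences have already been placed at known offsets on the contig (a real assembly must first find the overlaps), and it resolves each position by simple majority vote. The function name `consensus` and the toy reads are hypothetical and are not taken from this disclosure.

```python
from collections import Counter

def consensus(placed_reads):
    """Majority-vote consensus from reads already placed on a contig.

    placed_reads: list of (offset, sequence) pairs, where offset is the
    0-based contig position of the read's first base (assumed known).
    Returns the consensus string and the per-position redundancy.
    """
    length = max(offset + len(seq) for offset, seq in placed_reads)
    columns = [Counter() for _ in range(length)]
    for offset, seq in placed_reads:
        for i, base in enumerate(seq):
            columns[offset + i][base] += 1
    # Uncovered positions (none in this toy example) are reported as "N".
    bases = "".join(col.most_common(1)[0][0] if col else "N" for col in columns)
    redundancy = [sum(col.values()) for col in columns]
    return bases, redundancy

# Hypothetical overlapping reads; the third read carries one miscalled base.
reads = [(0, "ATGGCC"), (2, "GGCCTT"), (3, "GCATTA"), (4, "CCTTAG")]
seq, cov = consensus(reads)
print(seq)  # ATGGCCTTAG
print(cov)  # [1, 1, 2, 3, 4, 4, 3, 3, 2, 1]
```

In this toy case the miscalled base at the sixth position is outvoted three to one, which is the practical benefit of having several overlapping sequences at each nucleotide position.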

The cDNA Clone ID was deposited on the date and given the corresponding
deposit number listed in "ATCC Deposit No:Z and Date." Some of the deposits contain
multiple different clones corresponding to the same gene. "Vector" refers to the type of
vector contained in the cDNA Clone ID.

"Total NT Seq." refers to the total number of nucleotides in the contig identified
by "Gene No." The deposited clone may contain all or most of these sequences,
reflected by the nucleotide position indicated as "5' NT of Clone Seq." and the "3' NT
of Clone Seq." of SEQ ID NO:X. The nucleotide position of SEQ ID NO:X of the
putative start codon (methionine) is identified as "5' NT of Start Codon." Similarly,
the nucleotide position of SEQ ID NO:X of the predicted signal sequence is identified as
"5' NT of First AA of Signal Pep."

The translated amino acid sequence, beginning with the methionine, is identified
as "AA SEQ ID NO:Y," although other reading frames can also be easily translated
using known molecular biology techniques. The polypeptides produced by these
alternative open reading frames are specifically contemplated by the present invention.

The first and last amino acid position of SEQ ID NO:Y of the predicted signal
peptide is identified as "First AA of Sig Pep" and "Last AA of Sig Pep." The predicted
first amino acid position of SEQ ID NO:Y of the secreted portion is identified as
"Predicted First AA of Secreted Portion." Finally, the amino acid position of SEQ ID
NO:Y of the last amino acid in the open reading frame is identified as "Last AA of
ORF."
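
As a rough illustration of how these Table 1 positions partition a translated sequence, the following sketch slices a protein string using 1-based, inclusive positions. The function name, the 30-residue sequence, and the position values are all hypothetical placeholders chosen for the example; they do not correspond to any entry in Table 1.

```python
def split_by_table_positions(protein, first_sig, last_sig, first_secreted, last_orf):
    """Slice a translated sequence using 1-based, inclusive positions.

    first_sig / last_sig   : "First AA of Sig Pep" / "Last AA of Sig Pep"
    first_secreted         : "Predicted First AA of Secreted Portion"
    last_orf               : "Last AA of ORF"
    """
    signal_peptide = protein[first_sig - 1:last_sig]
    secreted_portion = protein[first_secreted - 1:last_orf]
    return signal_peptide, secreted_portion

# Hypothetical 30-residue translation product; positions are illustrative only.
protein = "MKLLVLAVCLLGASAQDPTEELKRCWYHGN"
sig, sec = split_by_table_positions(protein, 1, 15, 16, 30)
print(sig)  # MKLLVLAVCLLGASA
print(sec)  # QDPTEELKRCWYHGN
```

The only subtlety is the index convention: the positions in the text count from 1, whereas Python slices count from 0 and exclude the end index.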

SEQ ID NO:X and the translated SEQ ID NO:Y are sufficiently accurate and
otherwise suitable for a variety of uses well known in the art and described further
below. For instance, SEQ ID NO:X is useful for designing nucleic acid hybridization
probes that will detect nucleic acid sequences contained in SEQ ID NO:X or the cDNA
contained in the deposited clone. These probes will also hybridize to nucleic acid
molecules in biological samples, thereby enabling a variety of forensic and diagnostic
methods of the invention. Similarly, polypeptides identified from SEQ ID NO:Y may
be used to generate antibodies which bind specifically to the secreted proteins encoded
by the cDNA clones identified in Table 1.






Nevertheless, DNA sequences generated by sequencing reactions can contain
sequencing errors. The errors exist as misidentified nucleotides, or as insertions or
deletions of nucleotides in the generated DNA sequence. The erroneously inserted or
deleted nucleotides cause frame shifts in the reading frames of the predicted amino acid
sequence. In these cases, the predicted amino acid sequence diverges from the actual
amino acid sequence, even though the generated DNA sequence may be greater than
99.9% identical to the actual DNA sequence (for example, one base insertion or deletion
in an open reading frame of over 1000 bases).
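
The effect of a single erroneous insertion can be seen in a short sketch. It translates a hypothetical open reading frame fragment with the standard genetic code, then translates the same fragment with one extra base inserted. The sequences, the function names, and the simple identity measure (true bases divided by true bases plus one inserted base) are assumptions made for illustration only.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna):
    """Translate in frame from position 0, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

def percent_identity_after_insertion(length):
    """Assumed measure: identity of a read carrying one extra base,
    relative to `length` correctly read bases."""
    return 100.0 * length / (length + 1)

true_dna = "ATGGCTGAAAAGCTGTTCGGA"            # hypothetical ORF fragment
read_dna = true_dna[:6] + "C" + true_dna[6:]  # one erroneously inserted base

print(translate(true_dna))  # MAEKLFG
print(translate(read_dna))  # MARKAVR
print(round(percent_identity_after_insertion(1000), 2))
```

Only the residues upstream of the insertion agree; everything downstream is read in a shifted frame, even though under the assumed measure a 1000-base read with one insertion remains more than 99.9% identical at the nucleotide level.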



Accordingly, for those applications requiring precision in the nucleotide
sequence or the amino acid sequence, the present invention provides not only the
generated nucleotide sequence identified as SEQ ID NO:X and the predicted translated
amino acid sequence identified as SEQ ID NO:Y, but also a sample of plasmid DNA
containing a human cDNA of the invention deposited with the ATCC, as set forth in
Table 1. The nucleotide sequence of each deposited clone can readily be determined by
sequencing the deposited clone in accordance with known methods. The predicted
amino acid sequence can then be verified from such deposits. Moreover, the amino
acid sequence of the protein encoded by a particular clone can also be directly
determined by peptide sequencing or by expressing the protein in a suitable host cell
containing the deposited human cDNA, collecting the protein, and determining its
sequence.

The present invention also relates to the genes corresponding to SEQ ID NO:X,
SEQ ID NO:Y, or the deposited clone. The corresponding gene can be isolated in
accordance with known methods using the sequence information disclosed herein.
Such methods include preparing probes or primers from the disclosed sequence and
identifying or amplifying the corresponding gene from appropriate sources of genomic
material.

Also provided in the present invention are species homologs. Species
homologs may be isolated and identified by making suitable probes or primers from the
sequences provided herein and screening a suitable nucleic acid source for the desired
homologue.

The polypeptides of the invention can be prepared in any suitable manner. Such
polypeptides include isolated naturally occurring polypeptides, recombinantly produced
polypeptides, synthetically produced polypeptides, or polypeptides produced by a
combination of these methods. Means for preparing such polypeptides are well
understood in the art.

The polypeptides may be in the form of the secreted protein, including the
mature form, or may be a part of a larger protein, such as a fusion protein (see below).
It is often advantageous to include an additional amino acid sequence which contains
secretory or leader sequences, pro-sequences, sequences which aid in purification,
such as multiple histidine residues, or an additional sequence for stability during
recombinant production.

The polypeptides of the present invention are preferably provided in an isolated
form, and preferably are substantially purified. A recombinantly produced version of a
polypeptide, including the secreted polypeptide, can be substantially purified by the
one-step method described in Smith and Johnson, Gene 67:31-40 (1988).
Polypeptides of the invention also can be purified from natural or recombinant sources
using antibodies of the invention raised against the secreted protein in methods which
are well known in the art.

Signal Sequences 

Methods for predicting whether a protein has a signal sequence, as well as the
cleavage point for that sequence, are available. For instance, the method of McGeoch,
Virus Res. 3:271-286 (1985), uses the information from a short N-terminal charged
region and a subsequent uncharged region of the complete (uncleaved) protein. The
method of von Heijne, Nucleic Acids Res. 14:4683-4690 (1986) uses the information
from the residues surrounding the cleavage site, typically residues -13 to +2, where +1
indicates the amino terminus of the secreted protein. The accuracy of predicting the
cleavage points of known mammalian secretory proteins for each of these methods is in
the range of 75-80%. (von Heijne, supra.) However, the two methods do not always
produce the same predicted cleavage point(s) for a given protein.

In the present case, the deduced amino acid sequence of the secreted polypeptide
was analyzed by a computer program called SignalP (Henrik Nielsen et al., Protein
Engineering 10:1-6 (1997)), which predicts the cellular location of a protein based on
the amino acid sequence. As part of this computational prediction of localization, the
methods of McGeoch and von Heijne are incorporated. The analysis of the amino acid
sequences of the secreted proteins described herein by this program provided the results
shown in Table 1.

As one of ordinary skill would appreciate, however, cleavage sites sometimes
vary from organism to organism and cannot be predicted with absolute certainty.
Accordingly, the present invention provides secreted polypeptides having a sequence
shown in SEQ ID NO:Y which have an N-terminus beginning within 5 residues (i.e., +
or - 5 residues) of the predicted cleavage point. Similarly, it is also recognized that in
some cases, cleavage of the signal sequence from a secreted protein is not entirely
uniform, resulting in more than one secreted species. These polypeptides, and the
polynucleotides encoding such polypeptides, are contemplated by the present invention.
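
The set of secreted species whose N-terminus begins within 5 residues of the predicted cleavage point can be enumerated with a short sketch. The function name, the example protein, and the predicted position used here are hypothetical; positions are 1-based, and the predicted first residue of the secreted portion is taken as the reference point.

```python
def n_terminal_variants(protein, predicted_first_secreted, window=5):
    """Return {start_position: secreted_sequence} for every N-terminus
    within +/- `window` residues of the predicted first secreted residue.

    Positions are 1-based; start positions that would fall outside the
    protein are skipped.
    """
    variants = {}
    first = predicted_first_secreted - window
    last = predicted_first_secreted + window
    for start in range(first, last + 1):
        if 1 <= start <= len(protein):
            variants[start] = protein[start - 1:]
    return variants

# Hypothetical translation product with a predicted secreted portion at residue 16.
protein = "MKLLVLAVCLLGASAQDPTEELKRCWYHGN"
variants = n_terminal_variants(protein, predicted_first_secreted=16)
print(len(variants))  # 11
print(variants[16])   # QDPTEELKRCWYHGN
```

For a predicted position of 16, the candidate N-termini run from residue 11 through residue 21, giving eleven possible secreted species including the predicted one.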

Moreover, the signal sequence identified by the above analysis may not
necessarily predict the naturally occurring signal sequence. For example, the naturally
occurring signal sequence may be further upstream from the predicted signal sequence.
However, it is likely that the predicted signal sequence will be capable of directing the
secreted protein to the ER. These polypeptides, and the polynucleotides encoding such
polypeptides, are contemplated by the present invention.

Polynucleotide and Polypeptide Variants

"Variant" refers to a polynucleotide or polypeptide differing from the 
polynucleotide or polypeptide of the present invention, but retaining essential properties 
thereof. Generally, variants are overall closely similar, and, in many regions, identical 
to the polynucleotide or polypeptide of the present invention. 

By a polynucleotide having a nucleotide sequence at least, for example, 95%
"identical" to a reference nucleotide sequence of the present invention, it is intended that
the nucleotide sequence of the polynucleotide is identical to the reference sequence
except that the polynucleotide sequence may include up to five point mutations per each
100 nucleotides of the reference nucleotide sequence encoding the polypeptide. In other
words, to obtain a polynucleotide having a nucleotide sequence at least 95% identical to
a reference nucleotide sequence, up to 5% of the nucleotides in the reference sequence
may be deleted or substituted with another nucleotide, or a number of nucleotides up to
5% of the total nucleotides in the reference sequence may be inserted into the reference
sequence. The query sequence may be an entire sequence shown in Table 1, the ORF
(open reading frame), or any fragment specified as described herein.

As a practical matter, whether any particular nucleic acid molecule or 
polypeptide is at least 90%, 95%, 96%, 97%, 98% or 99% identical to a nucleotide 
sequence of the present invention can be determined conventionally using known 
computer programs. A preferred method for determining the best overall match between 
a query sequence (a sequence of the present invention) and a subject sequence, also 
referred to as a global sequence alignment, uses the FASTDB 
computer program based on the algorithm of Brutlag et al. (Comp. App. Biosci. (1990) 
6:237-245). In a sequence alignment the query and subject sequences are both DNA 
sequences. An RNA sequence can be compared by converting U's to T's. The result 
of said global sequence alignment is in percent identity. Preferred parameters used in a 
FASTDB alignment of DNA sequences to calculate percent identity are: 
Matrix=Unitary, k-tuple=4, Mismatch Penalty=1, Joining Penalty=30, Randomization 
Group Length=0, Cutoff Score=1, Gap Penalty=5, Gap Size Penalty=0.05, Window 
Size=500 or the length of the subject nucleotide sequence, whichever is shorter. 

If the subject sequence is shorter than the query sequence because of 5' or 3' 
deletions, not because of internal deletions, a manual correction must be made to the 
results. This is because the FASTDB program does not account for 5' and 3' 
truncations of the subject sequence when calculating percent identity. For subject 
sequences truncated at the 5' or 3' ends, relative to the query sequence, the percent 
identity is corrected by calculating the number of bases of the query sequence that are 5' 
and 3' of the subject sequence, which are not matched/aligned, as a percent of the total 
bases of the query sequence. Whether a nucleotide is matched/aligned is determined by 
results of the FASTDB sequence alignment. This percentage is then subtracted from 
the percent identity, calculated by the above FASTDB program using the specified 
parameters, to arrive at a final percent identity score. This corrected score is what is 
used for the purposes of the present invention. Only bases outside the 5' and 3' bases 
of the subject sequence, as displayed by the FASTDB alignment, which are not 
matched/aligned with the query sequence, are calculated for the purposes of manually 
adjusting the percent identity score. 

For example, a 90 base subject sequence is aligned to a 100 base query 
sequence to determine percent identity. The deletions occur at the 5' end of the subject 
sequence and therefore, the FASTDB alignment does not show a match/alignment of 
the first 10 bases at the 5' end. The 10 unpaired bases represent 10% of the sequence 
(number of bases at the 5' and 3' ends not matched/total number of bases in the query 
sequence) so 10% is subtracted from the percent identity score calculated by the 
FASTDB program. If the remaining 90 bases were perfectly matched the final percent 
identity would be 90%. In another example, a 90 base subject sequence is compared 
with a 100 base query sequence. This time the deletions are internal deletions so that 
there are no bases on the 5' or 3' of the subject sequence which are not matched/aligned 
with the query. In this case the percent identity calculated by FASTDB is not manually 
corrected. Once again, only bases 5' and 3' of the subject sequence which are not 
matched/aligned with the query sequence are manually corrected for. No other manual 
corrections are to be made for the purposes of the present invention. 
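
The manual correction described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name and its arguments are hypothetical, and the percent identity passed in is assumed to come from an alignment program run separately (nothing here calls FASTDB itself). 

```python
def corrected_percent_identity(reported_identity, query_length,
                               unmatched_5prime, unmatched_3prime):
    """Apply the terminal-truncation correction to a reported percent identity.

    reported_identity : percent identity (0-100) reported by the alignment
    query_length      : total number of bases in the query sequence
    unmatched_5prime  : query bases 5' of the subject that are not matched/aligned
    unmatched_3prime  : query bases 3' of the subject that are not matched/aligned
    Internal deletions are deliberately NOT counted here.
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    overhang = unmatched_5prime + unmatched_3prime
    if unmatched_5prime < 0 or unmatched_3prime < 0 or overhang > query_length:
        raise ValueError("unmatched base counts are out of range")
    # Unmatched terminal bases as a percent of the total bases of the query.
    penalty = 100.0 * overhang / query_length
    return reported_identity - penalty


# First worked example: 10 unmatched 5' bases of a 100 base query,
# remaining 90 bases perfectly matched (reported identity of 100%).
print(corrected_percent_identity(100.0, 100, 10, 0))  # 90.0
```

When the only differences are internal deletions, both unmatched counts are zero and the reported score is returned unchanged, as in the second example above. 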

By a polypeptide having an amino acid sequence at least, for example, 95% 
"identical" to a query amino acid sequence of the present invention, it is intended that 
the amino acid sequence of the subject polypeptide is identical to the query sequence 
except that the subject polypeptide sequence may include up to five amino acid 
alterations per each 100 amino acids of the query amino acid sequence. In other words, 
to obtain a polypeptide having an amino acid sequence at least 95% identical to a query 
amino acid sequence, up to 5% of the amino acid residues in the subject sequence may 
be inserted or deleted (indels) or substituted with another amino acid. These alterations 
of the reference sequence may occur at the amino or carboxy terminal positions of the 
reference amino acid sequence or anywhere between those terminal positions, 
interspersed either individually among residues in the reference sequence or in one or 
more contiguous groups within the reference sequence. 

As a practical matter, whether any particular polypeptide is at least 90%, 95%, 
96%, 97%, 98% or 99% identical to, for instance, the amino acid sequences shown in 
Table 1 or to the amino acid sequence encoded by the deposited DNA clone can be 
determined conventionally using known computer programs. A preferred method for 
determining the best overall match between a query sequence (a sequence of the present 
invention) and a subject sequence, also referred to as a global sequence alignment, 
uses the FASTDB computer program based on the algorithm of Brutlag 
et al. (Comp. App. Biosci. (1990) 6:237-245). In a sequence alignment the query and 
subject sequences are either both nucleotide sequences or both amino acid sequences. 
The result of said global sequence alignment is in percent identity. Preferred parameters 
used in a FASTDB amino acid alignment are: Matrix=PAM 0, k-tuple=2, Mismatch 
Penalty=1, Joining Penalty=20, Randomization Group Length=0, Cutoff Score=1, 
Window Size=sequence length, Gap Penalty=5, Gap Size Penalty=0.05, Window 
Size=500 or the length of the subject amino acid sequence, whichever is shorter. 

If the subject sequence is shorter than the query sequence due to N- or C- 
terminal deletions, not because of internal deletions, a manual correction must be made 
to the results. This is because the FASTDB program does not account for N- and C- 
terminal truncations of the subject sequence when calculating global percent identity. 
For subject sequences truncated at the N- and C-termini, relative to the query 
sequence, the percent identity is corrected by calculating the number of residues of the 
query sequence that are N- and C-terminal of the subject sequence, which are not 
matched/aligned with a corresponding subject residue, as a percent of the total residues of 
the query sequence. Whether a residue is matched/aligned is determined by results of 
the FASTDB sequence alignment. This percentage is then subtracted from the percent 
identity, calculated by the above FASTDB program using the specified parameters, to 
arrive at a final percent identity score. This final percent identity score is what is used 
for the purposes of the present invention. Only residues to the N- and C-termini of the 
subject sequence, which are not matched/aligned with the query sequence, are 
considered for the purposes of manually adjusting the percent identity score. That is, 
only query residue positions outside the farthest N- and C-terminal residues of the 
subject sequence are considered. 

For example, a 90 amino acid residue subject sequence is aligned with a 100 
residue query sequence to determine percent identity. The deletion occurs at the N- 
terminus of the subject sequence and therefore, the FASTDB alignment does not show 
a matching/alignment of the first 10 residues at the N-terminus. The 10 unpaired 
residues represent 10% of the sequence (number of residues at the N- and C-termini 
not matched/total number of residues in the query sequence) so 10% is subtracted from 
the percent identity score calculated by the FASTDB program. If the remaining 90 
residues were perfectly matched the final percent identity would be 90%. In another 
example, a 90 residue subject sequence is compared with a 100 residue query sequence. 
This time the deletions are internal deletions so there are no residues at the N- or C- 
termini of the subject sequence which are not matched/aligned with the query. In this 
case the percent identity calculated by FASTDB is not manually corrected. Once again, 
only residue positions outside the N- and C-terminal ends of the subject sequence, as 
displayed in the FASTDB alignment, which are not matched/aligned with the query 
sequence are manually corrected for. No other manual corrections are to be made for the 
purposes of the present invention. 

The variants may contain alterations in the coding regions, non-coding regions, 
or both. Especially preferred are polynucleotide variants containing alterations which 
produce silent substitutions, additions, or deletions, but do not alter the properties or 
activities of the encoded polypeptide. Nucleotide variants produced by silent 
substitutions due to the degeneracy of the genetic code are preferred. Moreover, 
variants in which 5-10, 1-5, or 1-2 amino acids are substituted, deleted, or added in any 
combination are also preferred. Polynucleotide variants can be produced for a variety 
of reasons, e.g., to optimize codon expression for a particular host (change codons in 
the human mRNA to those preferred by a bacterial host such as E. coli). 

Naturally occurring variants are called "allelic variants," and refer to one of 
several alternate forms of a gene occupying a given locus on a chromosome of an 
organism. (Genes II, Lewin, B., ed., John Wiley & Sons, New York (1985).) These 
allelic variants can vary at the polynucleotide level, the polypeptide level, or both. 
Alternatively, non-naturally occurring variants may be produced by mutagenesis 
techniques or by direct synthesis. 

Using known methods of protein engineering and recombinant DNA 
technology, variants may be generated to improve or alter the characteristics of the 
polypeptides of the present invention. For instance, one or more amino acids can be 
deleted from the N-terminus or C-terminus of the secreted protein without substantial 
loss of biological function. The authors of Ron et al., J. Biol. Chem. 268:2984-2988 
(1993), reported variant KGF proteins having heparin binding activity even after 
deleting 3, 8, or 27 amino-terminal amino acid residues. Similarly, Interferon gamma 
exhibited up to ten times higher activity after deleting 8-10 amino acid residues from the 
carboxy terminus of this protein. (Dobeli et al., J. Biotechnology 7:199-216 (1988).) 

Moreover, ample evidence demonstrates that variants often retain a biological 
activity similar to that of the naturally occurring protein. For example, Gayle and 
coworkers (J. Biol. Chem. 268:22105-22111 (1993)) conducted extensive mutational 
analysis of human cytokine IL-1a. They used random mutagenesis to generate over 
3,500 individual IL-1a mutants that averaged 2.5 amino acid changes per variant over 
the entire length of the molecule. Multiple mutations were examined at every possible 
amino acid position. The investigators found that "[m]ost of the molecule could be 
altered with little effect on either [binding or biological activity]." (See, Abstract.) In 
fact, only 23 unique amino acid sequences, out of more than 3,500 nucleotide 
sequences examined, produced a protein that significantly differed in activity from 
wild-type. 

Furthermore, even if deleting one or more amino acids from the N-terminus or 
C-terminus of a polypeptide results in modification or loss of one or more biological 
functions, other biological activities may still be retained. For example, the ability of a 
deletion variant to induce and/or to bind antibodies which recognize the secreted form 
will likely be retained when less than the majority of the residues of the secreted form 
are removed from the N-terminus or C-terminus. Whether a particular polypeptide 
lacking N- or C-terminal residues of a protein retains such immunogenic activities can 
readily be determined by routine methods described herein and otherwise known in the 
art. 

Thus, the invention further includes polypeptide variants which show 
substantial biological activity. Such variants include deletions, insertions, inversions, 
repeats, and substitutions selected according to general rules known in the art so as 
to have little effect on activity. For example, guidance concerning how to make 
phenotypically silent amino acid substitutions is provided in Bowie, J. U. et al., 
Science 247:1306-1310 (1990), wherein the authors indicate that there are two main 
strategies for studying the tolerance of an amino acid sequence to change. 

The first strategy exploits the tolerance of amino acid substitutions by natural 
selection during the process of evolution. By comparing amino acid sequences in 
different species, conserved amino acids can be identified. These conserved amino 
acids are likely important for protein function. In contrast, the toleration of 
substitutions at an amino acid position by natural selection indicates that the 
position is not critical for protein function. Thus, positions tolerating amino acid 
substitution could be modified while still maintaining biological activity of the protein. 

The second strategy uses genetic engineering to introduce amino acid changes at 
specific positions of a cloned gene to identify regions critical for protein function. For 
example, site-directed mutagenesis or alanine-scanning mutagenesis (introduction of 
single alanine mutations at every residue in the molecule) can be used. (Cunningham 
and Wells, Science 244:1081-1085 (1989).) The resulting mutant molecules can then 
be tested for biological activity. 

As the authors state, these two strategies have revealed that proteins are 
surprisingly tolerant of amino acid substitutions. The authors further indicate which 
amino acid changes are likely to be permissive at certain amino acid positions in the 
protein. For example, most buried (within the tertiary structure of the protein) amino 
acid residues require nonpolar side chains, whereas few features of surface side chains 
are generally conserved. Moreover, tolerated conservative amino acid substitutions 
involve replacement of the aliphatic or hydrophobic amino acids Ala, Val, Leu and Ile; 
replacement of the hydroxyl residues Ser and Thr; replacement of the acidic residues 
Asp and Glu; replacement of the amide residues Asn and Gln; replacement of the basic 
residues Lys, Arg, and His; replacement of the aromatic residues Phe, Tyr, and Trp; 
and replacement of the small-sized amino acids Ala, Ser, Thr, Met, and Gly. 
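
As an illustration, the substitution groups listed above can be written as a small lookup table. The sketch below is hypothetical (the names CONSERVATIVE_GROUPS, shared_groups and is_conservative are not from the cited reference) and only tests whether two residues, given as one-letter codes, fall in the same listed group; it makes no prediction about whether a particular substitution preserves activity. 

```python
# Groups transcribed from the paragraph above, using one-letter residue codes.
CONSERVATIVE_GROUPS = {
    "aliphatic/hydrophobic": set("AVLI"),  # Ala, Val, Leu, Ile
    "hydroxyl": set("ST"),                 # Ser, Thr
    "acidic": set("DE"),                   # Asp, Glu
    "amide": set("NQ"),                    # Asn, Gln
    "basic": set("KRH"),                   # Lys, Arg, His
    "aromatic": set("FYW"),                # Phe, Tyr, Trp
    "small": set("ASTMG"),                 # Ala, Ser, Thr, Met, Gly
}


def shared_groups(a, b):
    """Return the sorted names of every listed group containing both residues."""
    a, b = a.upper(), b.upper()
    return sorted(name for name, members in CONSERVATIVE_GROUPS.items()
                  if a in members and b in members)


def is_conservative(a, b):
    """True if a and b are different residues that share at least one group."""
    return a.upper() != b.upper() and bool(shared_groups(a, b))


print(is_conservative("L", "I"))  # True: both aliphatic/hydrophobic
print(is_conservative("D", "K"))  # False: acidic versus basic
```

Note that the groups overlap (Ser and Thr are both hydroxyl and small residues), which is why the lookup returns a list of group names rather than a single label. 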

Besides conservative amino acid substitution, variants of the present invention 
include (i) substitutions with one or more of the non-conserved amino acid residues, 
where the substituted amino acid residues may or may not be one encoded by the 
genetic code, or (ii) substitution with one or more of amino acid residues having a 
substituent group, or (iii) fusion of the mature polypeptide with another compound, 
such as a compound to increase the stability and/or solubility of the polypeptide (for 
example, polyethylene glycol), or (iv) fusion of the polypeptide with additional amino 
acids, such as an IgG Fc fusion region peptide, or leader or secretory sequence, or a 
sequence facilitating purification. Such variant polypeptides are deemed to be within 
the scope of those skilled in the art from the teachings herein. 

For example, polypeptide variants containing amino acid substitutions of 
charged amino acids with other charged or neutral amino acids may produce proteins 
with improved characteristics, such as less aggregation. Aggregation of pharmaceutical 
formulations both reduces activity and increases clearance due to the aggregate's 
immunogenic activity. (Pinckard et al., Clin. Exp. Immunol. 2:331-340 (1967); 
Robbins et al., Diabetes 36: 838-845 (1987); Cleland et al., Crit. Rev. Therapeutic 
Drug Carrier Systems 10:307-377 (1993).) 



Polynucleotide and Polypeptide Fragments 

In the present invention, a "polynucleotide fragment" refers to a short 
polynucleotide having a nucleic acid sequence contained in the deposited clone or 
shown in SEQ ID NO:X. The short nucleotide fragments are preferably at least about 
15 nt, and more preferably at least about 20 nt, still more preferably at least about 30 nt, 
and even more preferably, at least about 40 nt in length. A fragment "at least 20 nt in 
length," for example, is intended to include 20 or more contiguous bases from the 
cDNA sequence contained in the deposited clone or the nucleotide sequence shown in 
SEQ ID NO:X. These nucleotide fragments are useful as diagnostic probes and primers 
as discussed herein. Of course, larger fragments (e.g., 50, 150, 500, 600, 2000 
nucleotides) are preferred. 

Moreover, representative examples of polynucleotide fragments of the 
invention include, for example, fragments having a sequence from about nucleotide 
number 1-50, 51-100, 101-150, 151-200, 201-250, 251-300, 301-350, 351-400, 
401-450, 451-500, 501-550, 551-600, 651-700, 701-750, 751-800, 800-850, 851-900, 
901-950, 951-1000, 1001-1050, 1051-1100, 1101-1150, 1151-1200, 1201-1250, 
1251-1300, 1301-1350, 1351-1400, 1401-1450, 1451-1500, 1501-1550, 1551-1600, 
1601-1650, 1651-1700, 1701-1750, 1751-1800, 1801-1850, 1851-1900, 1901-1950, 
1951-2000, or 2001 to the end of SEQ ID NO:X or the cDNA contained in the 
deposited clone. In this context "about" includes the particularly recited ranges, larger 
or smaller by several (5, 4, 3, 2, or 1) nucleotides, at either terminus or at both termini. 
Preferably, these fragments encode a polypeptide which has biological activity. More 
preferably, these polynucleotides can be used as probes or primers as discussed herein. 

In the present invention, a "polypeptide fragment" refers to a short amino acid 
sequence contained in SEQ ID NO: Y or encoded by the cDNA contained in the 
deposited clone. Protein fragments may be "free-standing," or comprised within a 
larger polypeptide of which the fragment forms a part or region, most preferably as a 
single continuous region. Representative examples of polypeptide fragments of the 
invention include, for example, fragments from about amino acid number 1-20, 21-40, 
41-60, 61-80, 81-100, 102-120, 121-140, 141-160, or 161 to the end of the coding 
region. Moreover, polypeptide fragments can be about 20, 30, 40, 50, 60, 70, 80, 90, 
100, 110, 120, 130, 140, or 150 amino acids in length. In this context "about" 
includes the particularly recited ranges, larger or smaller by several (5, 4, 3, 2, or 1) 
amino acids, at either extreme or at both extremes. 
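
Consecutive fixed-width ranges of the kind recited above, and the "about" tolerance of up to 5 positions at either end, can be sketched as follows. The function names and the example sequence length are hypothetical, and the windows produced are a regular tiling for illustration only, not a restatement of the recited ranges. 

```python
def fragment_windows(total_length, width):
    """Return 1-based, inclusive (start, end) pairs tiling a sequence.

    The last window is shortened so that it stops at the end of the sequence.
    """
    if total_length < 1 or width < 1:
        raise ValueError("total_length and width must be positive")
    return [(start, min(start + width - 1, total_length))
            for start in range(1, total_length + 1, width)]


def is_about(fragment, window, tolerance=5):
    """True if each end of `fragment` lies within `tolerance` positions
    of the corresponding end of `window`."""
    return (abs(fragment[0] - window[0]) <= tolerance
            and abs(fragment[1] - window[1]) <= tolerance)


# Hypothetical 130-nucleotide sequence split into 50-nucleotide windows.
print(fragment_windows(130, 50))      # [(1, 50), (51, 100), (101, 130)]
print(is_about((3, 48), (1, 50)))     # True: each end is off by 2
print(is_about((10, 50), (1, 50)))    # False: start is off by 9
```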

Preferred polypeptide fragments include the secreted protein as well as the 
mature form. Further preferred polypeptide fragments include the secreted protein or 
the mature form having a continuous series of deleted residues from the amino or the 
carboxy terminus, or both. For example, any number of amino acids, ranging from 1-60, 
can be deleted from the amino terminus of either the secreted polypeptide or the 
mature form. Similarly, any number of amino acids, ranging from 1-30, can be deleted 
from the carboxy terminus of the secreted protein or mature form. Furthermore, any 
combination of the above amino and carboxy terminus deletions is preferred. 
Similarly, polynucleotide fragments encoding these polypeptide fragments are also 
preferred. 

Particularly, N-terminal deletions of the polypeptide of the present invention can 
be described by the general formula m-p, where p is the total number of amino acids in 
the polypeptide and m is an integer from 2 to (p-1), and where both of these integers (m 
& p) correspond to the position of the amino acid residue identified in SEQ ID NO: Y. 

Moreover, C-terminal deletions of the polypeptide of the present invention can 
also be described by the general formula 1-n, where n is an integer from 2 to (p-1), and 
again where these integers (n & p) correspond to the position of the amino acid residue 
identified in SEQ ID NO: Y. 

The invention also provides polypeptides having one or more amino acids 
deleted from both the amino and the carboxyl termini, which may be described 
generally as having residues m-n of SEQ ID NO: Y, where m and n are integers as 
described above. 
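
The m-p, 1-n and m-n notation maps directly onto 1-based, inclusive slicing. The sketch below is an illustration only: the helper name is hypothetical and the ten-residue sequence is made up for the example, not taken from SEQ ID NO: Y. The helper accepts any 1 <= m <= n <= p, which is slightly broader than the 2 to (p-1) ranges recited above. 

```python
def deletion_fragment(sequence, m=1, n=None):
    """Return residues m..n of `sequence` (1-based, inclusive).

    m > 1 corresponds to an N-terminal deletion (formula m-p),
    n < p corresponds to a C-terminal deletion (formula 1-n),
    and both together give residues m-n.
    """
    p = len(sequence)          # total number of amino acids
    if n is None:
        n = p
    if not (1 <= m <= n <= p):
        raise ValueError("require 1 <= m <= n <= p")
    return sequence[m - 1:n]


seq = "MKTLLVAGAS"  # made-up 10-residue sequence, so p = 10
print(deletion_fragment(seq, m=3))        # TLLVAGAS  (residues 3-10)
print(deletion_fragment(seq, n=8))        # MKTLLVAG  (residues 1-8)
print(deletion_fragment(seq, m=3, n=8))   # TLLVAG    (residues 3-8)
```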

Also preferred are polypeptide and polynucleotide fragments characterized by 
structural or functional domains, such as fragments that comprise alpha-helix and 
alpha-helix forming regions, beta-sheet and beta-sheet-forming regions, turn and 
turn-forming regions, coil and coil-forming regions, hydrophilic regions, hydrophobic 
regions, alpha amphipathic regions, beta amphipathic regions, flexible regions, 
surface-forming regions, substrate binding region, and high antigenic index regions. 
Polypeptide fragments of SEQ ID NO: Y falling within conserved domains are 
specifically contemplated by the present invention. Moreover, polynucleotide 
fragments encoding these domains are also contemplated. 

Other preferred fragments are biologically active fragments. Biologically active 
fragments are those exhibiting activity similar, but not necessarily identical, to an 
activity of the polypeptide of the present invention. The biological activity of the 
fragments may include an improved desired activity, or a decreased undesirable activity. 

Epitopes & Antibodies 

In the present invention, "epitopes" refer to polypeptide fragments having 
antigenic or immunogenic activity in an animal, especially in a human. A preferred 
embodiment of the present invention relates to a polypeptide fragment comprising an 
epitope, as well as the polynucleotide encoding this fragment. A region of a protein 
molecule to which an antibody can bind is defined as an "antigenic epitope." In 
contrast, an "immunogenic epitope" is defined as a part of a protein that elicits an 
antibody response. (See, for instance, Geysen et al., Proc. Natl. Acad. Sci. USA 
81:3998-4002 (1983).) 

Fragments which function as epitopes may be produced by any conventional 
means. (See, e.g., Houghten, R. A., Proc. Natl. Acad. Sci. USA 82:5131-5135 
(1985) further described in U.S. Patent No. 4,631,211.) 

In the present invention, antigenic epitopes preferably contain a sequence of at 
least seven, more preferably at least nine, and most preferably between about 15 and 
about 30 amino acids. Antigenic epitopes are useful to raise antibodies, including 
monoclonal antibodies, that specifically bind the epitope. (See, for instance, Wilson et 
al., Cell 37:767-778 (1984); Sutcliffe, J. G. et al., Science 219:660-666 (1983).) 

Similarly, immunogenic epitopes can be used to induce antibodies according to 
methods well known in the art. (See, for instance, Sutcliffe et al., supra; Wilson et al., 
supra; Chow, M. et al., Proc. Natl. Acad. Sci. USA 82:910-914; and Bittle, F. J. et 
al., J. Gen. Virol. 66:2347-2354 (1985).) A preferred immunogenic epitope includes 
the secreted protein. The immunogenic epitopes may be presented together with a 
carrier protein, such as an albumin, to an animal system (such as rabbit or mouse) or, if 
it is long enough (at least about 25 amino acids), without a carrier. However, 
immunogenic epitopes comprising as few as 8 to 10 amino acids have been shown to be 
sufficient to raise antibodies capable of binding to, at the very least, linear epitopes in a 
denatured polypeptide (e.g., in Western blotting). 

As used herein, the term "antibody" (Ab) or "monoclonal antibody" (Mab) is 
meant to include intact molecules as well as antibody fragments (such as, for example, 
Fab and F(ab')2 fragments) which are capable of specifically binding to protein. Fab 
and F(ab')2 fragments lack the Fc fragment of intact antibody, clear more rapidly from 
the circulation, and may have less non-specific tissue binding than an intact antibody. 
(Wahl et al., J. Nucl. Med. 24:316-325 (1983).) Thus, these fragments are preferred, 
as well as the products of a Fab or other immunoglobulin expression library. 
Moreover, antibodies of the present invention include chimeric, single chain, and 
humanized antibodies. 



Fusion Proteins 

Any polypeptide of the present invention can be used to generate fusion 
proteins. For example, the polypeptide of the present invention, when fused to a 
second protein, can be used as an antigenic tag. Antibodies raised against the 
polypeptide of the present invention can be used to indirectly detect the second protein 
by binding to the polypeptide. Moreover, because secreted proteins target cellular 
locations based on trafficking signals, the polypeptides of the present invention can be 
used as targeting molecules once fused to other proteins. 

Examples of domains that can be fused to polypeptides of the present invention 
include not only heterologous signal sequences, but also other heterologous functional 
regions. The fusion does not necessarily need to be direct, but may occur through 
linker sequences. 

Moreover, fusion proteins may also be engineered to improve characteristics of 
the polypeptide of the present invention. For instance, a region of additional amino 
acids, particularly charged amino acids, may be added to the N-terminus of the 
polypeptide to improve stability and persistence during purification from the host cell or 
subsequent handling and storage. Also, peptide moieties may be added to the 
polypeptide to facilitate purification. Such regions may be removed prior to final 
preparation of the polypeptide. The addition of peptide moieties to facilitate handling of 
polypeptides is a familiar and routine technique in the art. 

Moreover, polypeptides of the present invention, including fragments, and 
specifically epitopes, can be combined with parts of the constant domain of 
immunoglobulins (IgG), resulting in chimeric polypeptides. These fusion proteins 
facilitate purification and show an increased half-life in vivo. One reported example 
describes chimeric proteins consisting of the first two domains of the human 
CD4-polypeptide and various domains of the constant regions of the heavy or light chains of 
mammalian immunoglobulins. (EP A 394,827; Traunecker et al., Nature 331:84-86 
(1988).) Fusion proteins having disulfide-linked dimeric structures (due to the IgG) 
can also be more efficient in binding and neutralizing other molecules than the 
monomeric secreted protein or protein fragment alone. (Fountoulakis et al., J. 
Biochem. 270:3958-3964 (1995).) 

Similarly, EP-A-0 464 533 (Canadian counterpart 2045869) discloses fusion 
proteins comprising various portions of constant region of immunoglobulin molecules 
together with another human protein or part thereof. In many cases, the Fc part in a 
fusion protein is beneficial in therapy and diagnosis, and thus can result in, for 
example, improved pharmacokinetic properties. (EP-A 0232 262.) Alternatively, 
deleting the Fc part after the fusion protein has been expressed, detected, and purified, 
would be desired. For example, the Fc portion may hinder therapy and diagnosis if the 
fusion protein is used as an antigen for immunizations. In drug discovery, for 
example, human proteins, such as hIL-5, have been fused with Fc portions for the 
purpose of high-throughput screening assays to identify antagonists of hIL-5. (See, D. 
Bennett et al., J. Molecular Recognition 8:52-58 (1995); K. Johanson et al., J. Biol. 
Chem. 270:9459-9471 (1995).) 

Moreover, the polypeptides of the present invention can be fused to marker 
sequences, such as a peptide which facilitates purification of the fused polypeptide. In 
preferred embodiments, the marker amino acid sequence is a hexa-histidine peptide, 
such as the tag provided in a pQE vector (QIAGEN, Inc., 9259 Eton Avenue, 
Chatsworth, CA, 91311), among others, many of which are commercially available. 
As described in Gentz et al., Proc. Natl. Acad. Sci. USA 86:821-824 (1989), for 
instance, hexa-histidine provides for convenient purification of the fusion protein. 
Another peptide tag useful for purification, the "HA" tag, corresponds to an epitope 
derived from the influenza hemagglutinin protein. (Wilson et al., Cell 37:767 (1984).) 

Thus, any of these above fusions can be engineered using the polynucleotides 
or the polypeptides of the present invention. 

Vectors, Host Cells, and Protein Production 

The present invention also relates to vectors containing the polynucleotide of the 
present invention, host cells, and the production of polypeptides by recombinant 
techniques. The vector may be, for example, a phage, plasmid, viral, or retroviral 
vector. Retroviral vectors may be replication competent or replication defective. In the 
latter case, viral propagation generally will occur only in complementing host cells. 

The polynucleotides may be joined to a vector containing a selectable marker for 
propagation in a host. Generally, a plasmid vector is introduced in a precipitate, such 
as a calcium phosphate precipitate, or in a complex with a charged lipid. If the vector is 
a virus, it may be packaged in vitro using an appropriate packaging cell line and then 
transduced into host cells. 

The polynucleotide insert should be operatively linked to an appropriate 
promoter, such as the phage lambda PL promoter, the E. coli lac, trp, phoA and tac 
promoters, the SV40 early and late promoters and promoters of retroviral LTRs, to 
name a few. Other suitable promoters will be known to the skilled artisan. The 
expression constructs will further contain sites for transcription initiation, termination, 
and, in the transcribed region, a ribosome binding site for translation. The coding 
portion of the transcripts expressed by the constructs will preferably include a 
translation initiating codon at the beginning and a termination codon (UAA, UGA or 
UAG) appropriately positioned at the end of the polypeptide to be translated. 
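
A coding portion can be checked for these two features with a short script. The sketch below is an illustration under stated assumptions: the function name is hypothetical, AUG is assumed to be the translation initiating codon (the text above does not name one), and the example sequences are made up. It accepts DNA or RNA letters and also rejects sequences with an in-frame termination codon before the end. 

```python
STOP_CODONS = {"UAA", "UGA", "UAG"}


def has_start_and_stop(coding_sequence):
    """True if the sequence starts with AUG, ends with a termination codon,
    and contains no earlier in-frame termination codon."""
    rna = coding_sequence.upper().replace("T", "U")  # accept DNA or RNA input
    if len(rna) < 6 or len(rna) % 3 != 0:
        return False
    codons = [rna[i:i + 3] for i in range(0, len(rna), 3)]
    starts_ok = codons[0] == "AUG"       # assumed initiating codon
    ends_ok = codons[-1] in STOP_CODONS
    premature_stop = any(c in STOP_CODONS for c in codons[1:-1])
    return starts_ok and ends_ok and not premature_stop


print(has_start_and_stop("ATGGCCTAA"))     # True
print(has_start_and_stop("ATGTAAGCCTGA"))  # False: in-frame TAA before the end
print(has_start_and_stop("GCCGCCTAA"))     # False: no initiating codon
```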

As indicated, the expression vectors will preferably include at least one 
selectable marker. Such markers include dihydrofolate reductase, G418 or neomycin 
resistance for eukaryotic cell culture and tetracycline, kanamycin or ampicillin resistance 
genes for culturing in E. coli and other bacteria. Representative examples of 
appropriate hosts include, but are not limited to, bacterial cells, such as E. coli, 
Streptomyces and Salmonella typhimurium cells; fungal cells, such as yeast cells; insect 
cells such as Drosophila S2 and Spodoptera Sf9 cells; animal cells such as CHO, COS, 
293, and Bowes melanoma cells; and plant cells. Appropriate culture media and 
conditions for the above-described host cells are known in the art. 

Among vectors preferred for use in bacteria are pQE70, pQE60 and pQE-9, 
available from QIAGEN, Inc.; pBluescript vectors, Phagescript vectors, pNH8A, 
pNH16a, pNH18A, pNH46A, available from Stratagene Cloning Systems, Inc.; and 
ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5 available from Pharmacia Biotech, 
Inc. Among preferred eukaryotic vectors are pWLNEO, pSV2CAT, pOG44, pXT1 
and pSG available from Stratagene; and pSVK3, pBPV, pMSG and pSVL available 
from Pharmacia. Other suitable vectors will be readily apparent to the skilled artisan. 

Introduction of the construct into the host cell can be effected by calcium 
phosphate transfection, DEAE-dextran mediated transfection, cationic lipid-mediated 
transfection, electroporation, transduction, infection, or other methods. Such methods 
are described in many standard laboratory manuals, such as Davis et al., Basic Methods 
In Molecular Biology (1986). It is specifically contemplated that the polypeptides of the 
present invention may in fact be expressed by a host cell lacking a recombinant vector. 

A polypeptide of this invention can be recovered and purified from recombinant
cell cultures by well-known methods including ammonium sulfate or ethanol
precipitation, acid extraction, anion or cation exchange chromatography, 
phosphocellulose chromatography, hydrophobic interaction chromatography, affinity 
chromatography, hydroxylapatite chromatography and lectin chromatography. Most
preferably, high performance liquid chromatography ("HPLC") is employed for
purification. 

Polypeptides of the present invention, and preferably the secreted form, can also 
be recovered from: products purified from natural sources, including bodily fluids, 
tissues and cells, whether directly isolated or cultured; products of chemical synthetic
procedures; and products produced by recombinant techniques from a prokaryotic or
eukaryotic host, including, for example, bacterial, yeast, higher plant, insect, and 
mammalian cells. Depending upon the host employed in a recombinant production 
procedure, the polypeptides of the present invention may be glycosylated or may be 
non-glycosylated. In addition, polypeptides of the invention may also include an initial
modified methionine residue, in some cases as a result of host-mediated processes.
Thus, it is well known in the art that the N-terminal methionine encoded by the
translation initiation codon generally is removed with high efficiency from any protein
after translation in all eukaryotic cells. While the N-terminal methionine on most
proteins also is efficiently removed in most prokaryotes, for some proteins, this 
prokaryotic removal process is inefficient, depending on the nature of the amino acid to 
which the N-terminal methionine is covalently linked. 

Uses of the Polynucleotides 

Each of the polynucleotides identified herein can be used in numerous ways as 
reagents. The following description should be considered exemplary and utilizes 
known techniques. 

The polynucleotides of the present invention are useful for chromosome
identification. There exists an ongoing need to identify new chromosome markers,
since few chromosome marking reagents, based on actual sequence data (repeat 
polymorphisms), are presently available. Each polynucleotide of the present invention 
can be used as a chromosome marker. 

Briefly, sequences can be mapped to chromosomes by preparing PCR primers
(preferably 15-25 bp) from the sequences shown in SEQ ID NO:X. Primers can be
selected using computer analysis so that primers do not span more than one predicted 
exon in the genomic DNA. These primers are then used for PCR screening of somatic 
cell hybrids containing individual human chromosomes. Only those hybrids containing
the human gene corresponding to the SEQ ID NO:X will yield an amplified fragment.
Similarly, somatic hybrids provide a rapid method of PCR mapping the 
polynucleotides to particular chromosomes. Three or more clones can be assigned per 
day using a single thermal cycler. Moreover, sublocalization of the polynucleotides can 
be achieved with panels of specific chromosome fragments. Other gene mapping
strategies that can be used include in situ hybridization, prescreening with labeled
flow-sorted chromosomes, and preselection by hybridization to construct
chromosome-specific cDNA libraries.
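
The primer selection rule described above (primers of 15-25 bp that do not span more than one predicted exon) can be sketched in Python as follows. This is only an illustrative enumeration under stated assumptions: the exon coordinates are taken to be 0-based, half-open intervals supplied by some separate exon prediction step, and the function name is hypothetical. Real primer design would also weigh melting temperature, GC content and specificity, which are not modeled here.

```python
# Illustrative sketch only: exon coordinates and names are hypothetical inputs.
def candidate_primers(sequence, exons, min_len=15, max_len=25):
    """List (start, end, primer) for every window of min_len..max_len bases
    that lies entirely within a single predicted exon.

    exons: list of (start, end) half-open, 0-based intervals on `sequence`.
    """
    candidates = []
    for exon_start, exon_end in exons:
        for length in range(min_len, max_len + 1):
            # The last valid start keeps the whole window inside this exon.
            for start in range(exon_start, exon_end - length + 1):
                end = start + length
                candidates.append((start, end, sequence[start:end]))
    return candidates


# Usage with a made-up 40-base sequence and two made-up exons.
example_sequence = "ACGT" * 10
example_exons = [(0, 18), (22, 40)]
primers = candidate_primers(example_sequence, example_exons)
```

Because each window is generated inside one exon interval, no candidate can cross an exon boundary, which is the property the text asks for.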

Precise chromosomal location of the polynucleotides can also be achieved using 
fluorescence in situ hybridization (FISH) of a metaphase chromosomal spread. This
technique uses polynucleotides as short as 500 or 600 bases; however, polynucleotides
2,000-4,000 bp are preferred. For a review of this technique, see Verma et al.,
"Human Chromosomes: a Manual of Basic Techniques," Pergamon Press, New York
(1988). 

For chromosome mapping, the polynucleotides can be used individually (to 
mark a single chromosome or a single site on that chromosome) or in panels (for
marking multiple sites and/or multiple chromosomes). Preferred polynucleotides 
correspond to the noncoding regions of the cDNAs because the coding sequences are
more likely conserved within gene families, thus increasing the chance of cross
hybridization during chromosomal mapping. 

Once a polynucleotide has been mapped to a precise chromosomal location, the 
physical position of the polynucleotide can be used in linkage analysis. Linkage 
analysis establishes coinheritance between a chromosomal location and presentation of a
particular disease. (Disease mapping data are found, for example, in V. McKusick, 
Mendelian Inheritance in Man (available on line through Johns Hopkins University 
Welch Medical Library) .) Assuming 1 megabase mapping resolution and one gene per 
20 kb, a cDNA precisely localized to a chromosomal region associated with the disease
could be one of 50-500 potential causative genes.
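
The arithmetic behind this estimate can be made explicit. The sketch below assumes, as the text does, a uniform density of one gene per 20 kb; the function name is hypothetical. Under that assumption a 1 megabase region holds 50 candidate genes, and the upper figure of 500 would correspond to a region of about 10 megabases (this last correspondence is an inference from the stated density, not a statement in the text).

```python
# Illustrative sketch only: assumes a uniform gene density of one gene per 20 kb.
def candidate_gene_count(region_bp: int, bp_per_gene: int = 20_000) -> int:
    """Estimate how many genes fall within a mapped chromosomal region."""
    if region_bp < 0 or bp_per_gene <= 0:
        raise ValueError("region_bp must be >= 0 and bp_per_gene must be > 0")
    return region_bp // bp_per_gene


genes_at_1_mb = candidate_gene_count(1_000_000)    # 1 megabase resolution
genes_at_10_mb = candidate_gene_count(10_000_000)  # 10 megabase resolution
```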

Thus, once coinheritance is established, differences in the polynucleotide and 
the corresponding gene between affected and unaffected individuals can be examined. 
First, visible structural alterations in the chromosomes, such as deletions or 
translocations, are examined in chromosome spreads or by PCR. If no structural
alterations exist, the presence of point mutations is ascertained. Mutations observed in
some or all affected individuals, but not in normal individuals, indicate that the
mutation may cause the disease. However, complete sequencing of the polypeptide and 
the corresponding gene from several normal individuals is required to distinguish the 
mutation from a polymorphism. If a new polymorphism is identified, this polymorphic
polypeptide can be used for further linkage analysis.

Furthermore, increased or decreased expression of the gene in affected 
individuals as compared to unaffected individuals can be assessed using 
polynucleotides of the present invention. Any of these alterations (altered expression, 
chromosomal rearrangement, or mutation) can be used as a diagnostic or prognostic
marker.

In addition to the foregoing, a polynucleotide can be used to control gene 
expression through triple helix formation or antisense DNA or RNA. Both methods 
rely on binding of the polynucleotide to DNA or RNA. For these techniques, preferred 
polynucleotides are usually 20 to 40 bases in length and complementary to either the
region of the gene involved in transcription (triple helix - see Lee et al., Nucl. Acids
Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et al., Science
251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC
Press, Boca Raton, FL (1988)). Triple helix formation optimally results in a shut-off
of RNA transcription from DNA, while antisense RNA hybridization blocks translation
of an mRNA molecule into polypeptide. Both techniques are effective in model
systems, and the information disclosed herein can be used to design antisense or triple
helix polynucleotides in an effort to treat disease.
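
As a rough illustration of the antisense design step, the following Python sketch derives an antisense RNA of 20 to 40 bases that is complementary to a chosen stretch of an mRNA. The function name and the choice of target window are assumptions for illustration; selecting an effective target region in practice involves considerations (accessibility, specificity) that this sketch does not address.

```python
# Illustrative sketch only: names and window selection are hypothetical.
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(mrna: str, start: int, length: int = 20) -> str:
    """Return the antisense RNA (written 5'->3') complementary to
    mrna[start:start + length], where length is 20 to 40 bases."""
    if not 20 <= length <= 40:
        raise ValueError("length must be between 20 and 40 bases")
    seq = mrna.upper().replace("T", "U")  # accept DNA or RNA notation
    target = seq[start:start + length]
    if start < 0 or len(target) != length:
        raise ValueError("target window falls outside the sequence")
    # Complement each base, then reverse so the result reads 5'->3'.
    return target.translate(_RNA_COMPLEMENT)[::-1]


example_mrna = "A" * 10 + "C" * 10 + "G" * 5
example_antisense = antisense_rna(example_mrna, 0, 20)
```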

Polynucleotides of the present invention are also useful in gene therapy. One 
goal of gene therapy is to insert a normal gene into an organism having a defective 
gene, in an effort to correct the genetic defect. The polynucleotides disclosed in the
present invention offer a means of targeting such genetic defects in a highly accurate 
manner. Another goal is to insert a new gene that was not present in the host genome, 
thereby producing a new trait in the host cell. 

The polynucleotides are also useful for identifying individuals from minute
biological samples. The United States military, for example, is considering the use of
restriction fragment length polymorphism (RFLP) for identification of its personnel. In 
this technique, an individual's genomic DNA is digested with one or more restriction 
enzymes, and probed on a Southern blot to yield unique bands for identifying 
personnel. This method does not suffer from the current limitations of "Dog Tags"
which can be lost, switched, or stolen, making positive identification difficult. The
polynucleotides of the present invention can be used as additional DNA markers for 
RFLP. 
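
To make the RFLP principle concrete, the sketch below computes the fragment lengths produced when a linear DNA is cut at every occurrence of a restriction site, and shows how a single base change that destroys one site alters the band pattern. The two short alleles are invented for illustration; EcoRI (recognition site GAATTC, cutting after the first G) is used only as a familiar example enzyme, and the function name is hypothetical.

```python
# Illustrative sketch only: the alleles below are made-up toy sequences.
def digest(dna: str, site: str, cut_offset: int) -> list:
    """Return fragment lengths after cutting a linear DNA at every
    occurrence of `site`, `cut_offset` bases into the site."""
    dna = dna.upper()
    cuts = []
    pos = dna.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = dna.find(site, pos + 1)
    bounds = [0] + cuts + [len(dna)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Allele A carries two EcoRI sites; allele B has lost the second one
# through a single base change (GAATTC -> GAGTTC).
allele_a = "AAAA" + "GAATTC" + "CCCCCC" + "GAATTC" + "TT"
allele_b = "AAAA" + "GAATTC" + "CCCCCC" + "GAGTTC" + "TT"
fragments_a = digest(allele_a, "GAATTC", 1)
fragments_b = digest(allele_b, "GAATTC", 1)
```

The differing fragment lengths between the two alleles are what would appear as distinct bands on a Southern blot.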

The polynucleotides of the present invention can also be used as an alternative to 
RFLP, by determining the actual base-by-base DNA sequence of selected portions of an
individual's genome. These sequences can be used to prepare PCR primers for
amplifying and isolating such selected DNA, which can then be sequenced. Using this
technique, individuals can be identified because each individual will have a unique set
of DNA sequences. Once a unique ID database is established for an individual,
positive identification of that individual, living or dead, can be made from extremely
small tissue samples.

Forensic biology also benefits from using DNA-based identification techniques 
as disclosed herein. DNA sequences taken from very small biological samples such as 
tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, semen, etc., can be 
amplified using PCR. In one prior art technique, gene sequences amplified from
polymorphic loci, such as the DQa class II HLA gene, are used in forensic biology to
identify individuals. (Erlich, H., PCR Technology, Freeman and Co. (1992).) Once
these specific polymorphic loci are amplified, they are digested with one or more
restriction enzymes, yielding an identifying set of bands on a Southern blot probed with
DNA corresponding to the DQa class II HLA gene. Similarly, polynucleotides of the
present invention can be used as polymorphic markers for forensic purposes.

There is also a need for reagents capable of identifying the source of a particular 
tissue. Such need arises, for example, in forensics when presented with tissue of
unknown origin. Appropriate reagents can comprise, for example, DNA probes or
primers specific to particular tissue prepared from the sequences of the present 
invention. Panels of such reagents can identify tissue by species and/or by organ type. 
In a similar fashion, these reagents can be used to screen tissue cultures for 
contamination.

At the very least, the polynucleotides of the present invention can be used as
molecular weight markers on Southern gels, as diagnostic probes for the presence of a 
specific mRNA in a particular cell type, as a probe to "subtract-out" known sequences 
in the process of discovering novel polynucleotides, for selecting and making oligomers 
for attachment to a "gene chip" or other support, to raise anti-DNA antibodies using
DNA immunization techniques, and as an antigen to elicit an immune response. 

Uses of the Polypeptides 

Each of the polypeptides identified herein can be used in numerous ways. The 
following description should be considered exemplary and utilizes known techniques.

A polypeptide of the present invention can be used to assay protein levels in a 
biological sample using antibody-based techniques. For example, protein expression in 
tissues can be studied with classical immunohistological methods. (Jalkanen, M., et
al., J. Cell. Biol. 101:976-985 (1985); Jalkanen, M., et al., J. Cell. Biol.
105:3087-3096 (1987).) Other antibody-based methods useful for detecting protein gene
expression include immunoassays, such as the enzyme linked immunosorbent assay
(ELISA) and the radioimmunoassay (RIA). Suitable antibody assay labels are known 
in the art and include enzyme labels, such as glucose oxidase, and radioisotopes, such
as iodine (125I, 121I), carbon (14C), sulfur (35S), tritium (3H), indium (112In), and
technetium (99mTc), and fluorescent labels, such as fluorescein and rhodamine, and
biotin. 

In addition to assaying secreted protein levels in a biological sample, proteins 
can also be detected in vivo by imaging. Antibody labels or markers for in vivo 
imaging of protein include those detectable by X-radiography, NMR or ESR. For
X-radiography, suitable labels include radioisotopes such as barium or cesium, which emit
detectable radiation but are not overtly harmful to the subject. Suitable markers for 
NMR and ESR include those with a detectable characteristic spin, such as deuterium, 
which may be incorporated into the antibody by labeling of nutrients for the relevant 
hybridoma. 

A protein-specific antibody or antibody fragment which has been labeled with
an appropriate detectable imaging moiety, such as a radioisotope (for example, 131I,
112In, 99mTc), a radio-opaque substance, or a material detectable by nuclear magnetic
resonance, is introduced (for example, parenterally, subcutaneously, or
intraperitoneally) into the mammal. It will be understood in the art that the size of the 
subject and the imaging system used will determine the quantity of imaging moiety 
needed to produce diagnostic images. In the case of a radioisotope moiety, for a human 
subject, the quantity of radioactivity injected will normally range from about 5 to 20
millicuries of 99mTc. The labeled antibody or antibody fragment will then 
preferentially accumulate at the location of cells which contain the specific protein. In 
vivo tumor imaging is described in S.W. Burchiel et al., "Immunopharmacokinetics of 
Radiolabeled Antibodies and Their Fragments." (Chapter 13 in Tumor Imaging: The
Radiochemical Detection of Cancer, S.W. Burchiel and B.A. Rhodes, eds., Masson
Publishing Inc. (1982).) 

Thus, the invention provides a method of diagnosing a disorder, which involves
(a) assaying the expression of a polypeptide of the present invention in cells or body
fluid of an individual; and (b) comparing the level of gene expression with a standard gene
expression level, whereby an increase or decrease in the assayed polypeptide gene
expression level compared to the standard expression level is indicative of a disorder.
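
The comparison in step (b) can be sketched as a simple ratio test. The text does not specify how large a difference must be to count as an increase or decrease, so the fold-change threshold below is purely an assumption for illustration, as are the function name and the returned labels.

```python
# Illustrative sketch only: the 2-fold threshold is an assumed, arbitrary cutoff.
def compare_to_standard(assayed: float, standard: float,
                        fold_threshold: float = 2.0) -> str:
    """Classify an assayed expression level relative to a standard level."""
    if assayed <= 0 or standard <= 0 or fold_threshold <= 1:
        raise ValueError("levels must be positive and fold_threshold > 1")
    ratio = assayed / standard
    if ratio >= fold_threshold:
        return "increased"
    if ratio <= 1 / fold_threshold:
        return "decreased"
    return "within range"
```

In this sketch either "increased" or "decreased" would correspond to the altered expression that the method treats as indicative of a disorder.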

Moreover, polypeptides of the present invention can be used to treat disease. 
For example, patients can be administered a polypeptide of the present invention in an 
effort to replace absent or decreased levels of the polypeptide (e.g., insulin), to
supplement absent or decreased levels of a different polypeptide (e.g., hemoglobin S
for hemoglobin B), to inhibit the activity of a polypeptide (e.g., an oncogene), to 
activate the activity of a polypeptide (e.g., by binding to a receptor), to reduce the 
activity of a membrane bound receptor by competing with it for free ligand (e.g., 
soluble TNF receptors used in reducing inflammation), or to bring about a desired
response (e.g., blood vessel growth).

Similarly, antibodies directed to a polypeptide of the present invention can also 
be used to treat disease. For example, administration of an antibody directed to a 
polypeptide of the present invention can bind and reduce overproduction of the 
polypeptide. Similarly, administration of an antibody can activate the polypeptide, such
as by binding to a polypeptide bound to a membrane (receptor).

At the very least, the polypeptides of the present invention can be used as 
molecular weight markers on SDS-PAGE gels or on molecular sieve gel filtration 
columns using methods well known to those of skill in the art. Polypeptides can also 
be used to raise antibodies, which in turn are used to measure protein expression from a
recombinant cell, as a way of assessing transformation of the host cell. Moreover, the
polypeptides of the present invention can be used to test the following biological 
activities.

Biological Activities

The polynucleotides and polypeptides of the present invention can be used in 
assays to test for one or more biological activities. If these polynucleotides and 
polypeptides do exhibit activity in a particular assay, it is likely that these molecules
may be involved in the diseases associated with the biological activity. Thus, the 
polynucleotides and polypeptides could be used to treat the associated disease. 

Immune Activity 

A polypeptide or polynucleotide of the present invention may be useful in
treating deficiencies or disorders of the immune system, by activating or inhibiting the
proliferation, differentiation, or mobilization (chemotaxis) of immune cells. Immune 
cells develop through a process called hematopoiesis, producing myeloid (platelets, red 
blood cells, neutrophils, and macrophages) and lymphoid (B and T lymphocytes) cells
from pluripotent stem cells. The etiology of these immune deficiencies or disorders
may be genetic, somatic, such as cancer or some autoimmune disorders, acquired (e.g.,
by chemotherapy or toxins), or infectious. Moreover, a polynucleotide or polypeptide 
of the present invention can be used as a marker or detector of a particular immune 
system disease or disorder. 

A polynucleotide or polypeptide of the present invention may be useful in
treating or detecting deficiencies or disorders of hematopoietic cells. A polypeptide or
polynucleotide of the present invention could be used to increase differentiation and
proliferation of hematopoietic cells, including the pluripotent stem cells, in an effort to
treat those disorders associated with a decrease in certain (or many) types of hematopoietic
cells. Examples of immunologic deficiency syndromes include, but are not limited to:
blood protein disorders (e.g., agammaglobulinemia, dysgammaglobulinemia), ataxia
telangiectasia, common variable immunodeficiency, DiGeorge Syndrome, HIV
infection, HTLV-BLV infection, leukocyte adhesion deficiency syndrome,
lymphopenia, phagocyte bactericidal dysfunction, severe combined immunodeficiency
(SCIDs), Wiskott-Aldrich Disorder, anemia, thrombocytopenia, or hemoglobinuria.

Moreover, a polypeptide or polynucleotide of the present invention could also 
be used to modulate hemostatic (the stopping of bleeding) or thrombolytic activity (clot 
formation). For example, by increasing hemostatic or thrombolytic activity, a 
polynucleotide or polypeptide of the present invention could be used to treat blood
coagulation disorders (e.g., afibrinogenemia, factor deficiencies), blood platelet
disorders (e.g., thrombocytopenia), or wounds resulting from trauma, surgery, or other
causes. Alternatively, a polynucleotide or polypeptide of the present invention that can
decrease hemostatic or thrombolytic activity could be used to inhibit or dissolve
clotting. These molecules could be important in the treatment of heart attacks 
(infarction), strokes, or scarring. 

A polynucleotide or polypeptide of the present invention may also be useful in 
treating or detecting autoimmune disorders. Many autoimmune disorders result from
inappropriate recognition of self as foreign material by immune cells. This 
inappropriate recognition results in an immune response leading to the destruction of the 
host tissue. Therefore, the administration of a polypeptide or polynucleotide of the 
present invention that inhibits an immune response, particularly the proliferation,
differentiation, or chemotaxis of T-cells, may be an effective therapy in preventing
autoimmune disorders. 

Examples of autoimmune disorders that can be treated or detected by the present 
invention include, but are not limited to: Addison's Disease, hemolytic anemia, 
antiphospholipid syndrome, rheumatoid arthritis, dermatitis, allergic encephalomyelitis,
glomerulonephritis, Goodpasture's Syndrome, Graves' Disease, Multiple Sclerosis,
Myasthenia Gravis, Neuritis, Ophthalmia, Bullous Pemphigoid, Pemphigus,
Polyendocrinopathies, Purpura, Reiter's Disease, Stiff-Man Syndrome, Autoimmune
Thyroiditis, Systemic Lupus Erythematosus, Autoimmune Pulmonary Inflammation,
Guillain-Barre Syndrome, insulin dependent diabetes mellitus, and autoimmune
inflammatory eye disease.

Similarly, allergic reactions and conditions, such as asthma (particularly allergic 
asthma) or other respiratory problems, may also be treated by a polypeptide or 
polynucleotide of the present invention. Moreover, these molecules can be used to treat 
anaphylaxis, hypersensitivity to an antigenic molecule, or blood group incompatibility. 

A polynucleotide or polypeptide of the present invention may also be used to
treat and/or prevent organ rejection or graft-versus-host disease (GVHD). Organ
rejection occurs by host immune cell destruction of the transplanted tissue through an 
immune response. Similarly, an immune response is also involved in GVHD, but, in 
this case, the foreign transplanted immune cells destroy the host tissues. The
administration of a polypeptide or polynucleotide of the present invention that inhibits
an immune response, particularly the proliferation, differentiation, or chemotaxis of
T-cells, may be an effective therapy in preventing organ rejection or GVHD.

Similarly, a polypeptide or polynucleotide of the present invention may also be 
used to modulate inflammation. For example, the polypeptide or polynucleotide may
inhibit the proliferation and differentiation of cells involved in an inflammatory
response. These molecules can be used to treat inflammatory conditions, both chronic
and acute conditions, including inflammation associated with infection (e.g., septic
shock, sepsis, or systemic inflammatory response syndrome (SIRS)),
ischemia-reperfusion injury, endotoxin lethality, arthritis, complement-mediated hyperacute
rejection, nephritis, cytokine or chemokine induced lung injury, inflammatory bowel
disease, Crohn's disease, or resulting from over production of cytokines (e.g., TNF or
IL-1).



Hyperproliferative Disorders 

A polypeptide or polynucleotide can be used to treat or detect hyperproliferative 
disorders, including neoplasms. A polypeptide or polynucleotide of the present
invention may inhibit the proliferation of the disorder through direct or indirect
interactions. Alternatively, a polypeptide or polynucleotide of the present invention
may proliferate other cells which can inhibit the hyperproliferative disorder. 

For example, by increasing an immune response, particularly increasing 
antigenic qualities of the hyperproliferative disorder or by proliferating, differentiating,
or mobilizing T-cells, hyperproliferative disorders can be treated. This immune
response may be increased by either enhancing an existing immune response, or by
initiating a new immune response. Alternatively, decreasing an immune response may 
also be a method of treating hyperproliferative disorders, such as a chemotherapeutic 
agent. 

Examples of hyperproliferative disorders that can be treated or detected by a
polynucleotide or polypeptide of the present invention include, but are not limited to
neoplasms located in the: abdomen, bone, breast, digestive system, liver, pancreas, 
peritoneum, endocrine glands (adrenal, parathyroid, pituitary, testicles, ovary, thymus, 
thyroid), eye, head and neck, nervous (central and peripheral), lymphatic system,
pelvic, skin, soft tissue, spleen, thoracic, and urogenital.

Similarly, other hyperproliferative disorders can also be treated or detected by a 
polynucleotide or polypeptide of the present invention. Examples of such 
hyperproliferative disorders include, but are not limited to: hypergammaglobulinemia, 
lymphoproliferative disorders, paraproteinemias, purpura, sarcoidosis, Sezary
Syndrome, Waldenstrom's Macroglobulinemia, Gaucher's Disease, histiocytosis, and
any other hyperproliferative disease, besides neoplasia, located in an organ system 
listed above. 



Infectious Disease 

A polypeptide or polynucleotide of the present invention can be used to treat or 
detect infectious agents. For example, by increasing the immune response, particularly 
increasing the proliferation and differentiation of B and/or T cells, infectious diseases
may be treated. The immune response may be increased by either enhancing an existing
immune response, or by initiating a new immune response. Alternatively, the 
polypeptide or polynucleotide of the present invention may also directly inhibit the 
infectious agent, without necessarily eliciting an immune response. 

Viruses are one example of an infectious agent that can cause disease or
symptoms that can be treated or detected by a polynucleotide or polypeptide of the
present invention. Examples of viruses include, but are not limited to, the following
DNA and RNA viral families: Arbovirus, Adenoviridae, Arenaviridae, Arterivirus,
Birnaviridae, Bunyaviridae, Caliciviridae, Circoviridae, Coronaviridae, Flaviviridae,
Hepadnaviridae (Hepatitis), Herpesviridae (such as Cytomegalovirus, Herpes
Simplex, Herpes Zoster), Mononegavirus (e.g., Paramyxoviridae, Morbillivirus,
Rhabdoviridae), Orthomyxoviridae (e.g., Influenza), Papovaviridae, Parvoviridae,
Picornaviridae, Poxviridae (such as Smallpox or Vaccinia), Reoviridae (e.g.,
Rotavirus), Retroviridae (HTLV-I, HTLV-II, Lentivirus), and Togaviridae (e.g.,
Rubivirus). Viruses falling within these families can cause a variety of diseases or
symptoms, including, but not limited to: arthritis, bronchiolitis, encephalitis, eye
infections (e.g., conjunctivitis, keratitis), chronic fatigue syndrome, hepatitis (A, B, C,
E, Chronic Active, Delta), meningitis, opportunistic infections (e.g., AIDS),
pneumonia, Burkitt's Lymphoma, chickenpox, hemorrhagic fever, Measles, Mumps,
Parainfluenza, Rabies, the common cold, Polio, leukemia, Rubella, sexually
transmitted diseases, skin diseases (e.g., Kaposi's, warts), and viremia. A polypeptide
or polynucleotide of the present invention can be used to treat or detect any of these 
symptoms or diseases. 

Similarly, bacterial or fungal agents that can cause disease or symptoms and that
can be treated or detected by a polynucleotide or polypeptide of the present invention
include, but are not limited to, the following Gram-negative and Gram-positive bacterial
families and fungi: Actinomycetales (e.g., Corynebacterium, Mycobacterium,
Nocardia), Aspergillosis, Bacillaceae (e.g., Anthrax, Clostridium), Bacteroidaceae,
Blastomycosis, Bordetella, Borrelia, Brucellosis, Candidiasis, Campylobacter,
Coccidioidomycosis, Cryptococcosis, Dermatomycoses, Enterobacteriaceae (Klebsiella,
Salmonella, Serratia, Yersinia), Erysipelothrix, Helicobacter, Legionellosis,
Leptospirosis, Listeria, Mycoplasmatales, Neisseriaceae (e.g., Acinetobacter,
Gonorrhea, Meningococcal), Pasteurellaceae Infections (e.g., Actinobacillus,
Haemophilus, Pasteurella), Pseudomonas, Rickettsiaceae, Chlamydiaceae, Syphilis,
and Staphylococcal. These bacterial or fungal families can cause the following diseases
or symptoms, including, but not limited to: bacteremia, endocarditis, eye infections
(conjunctivitis, tuberculosis, uveitis), gingivitis, opportunistic infections (e.g., AIDS
related infections), paronychia, prosthesis-related infections, Reiter's Disease,
respiratory tract infections, such as Whooping Cough or Empyema, sepsis, Lyme
Disease, Cat-Scratch Disease, Dysentery, Paratyphoid Fever, food poisoning,
Typhoid, pneumonia, Gonorrhea, meningitis, Chlamydia, Syphilis, Diphtheria,
Leprosy, Paratuberculosis, Tuberculosis, Lupus, Botulism, gangrene, tetanus,
impetigo, Rheumatic Fever, Scarlet Fever, sexually transmitted diseases, skin diseases
(e.g., cellulitis, dermatomycoses), toxemia, urinary tract infections, wound infections.
A polypeptide or polynucleotide of the present invention can be used to treat or detect 
any of these symptoms or diseases. 

Moreover, parasitic agents causing disease or symptoms that can be treated or
detected by a polynucleotide or polypeptide of the present invention include, but are not
limited to, the following families: Amebiasis, Babesiosis, Coccidiosis,
Cryptosporidiosis, Dientamoebiasis, Dourine, Ectoparasitic, Giardiasis, Helminthiasis,
Leishmaniasis, Theileriasis, Toxoplasmosis, Trypanosomiasis, and Trichomonas.
These parasites can cause a variety of diseases or symptoms, including, but not limited
to: Scabies, Trombiculiasis, eye infections, intestinal disease (e.g., dysentery, 
giardiasis), liver disease, lung disease, opportunistic infections (e.g., AIDS related), 
Malaria, pregnancy complications, and toxoplasmosis. A polypeptide or polynucleotide 
of the present invention can be used to treat or detect any of these symptoms or
diseases.

Preferably, treatment using a polypeptide or polynucleotide of the present 
invention could either be by administering an effective amount of a polypeptide to the 
patient, or by removing cells from the patient, supplying the cells with a polynucleotide 
of the present invention, and returning the engineered cells to the patient (ex vivo 
therapy). Moreover, the polypeptide or polynucleotide of the present invention can be
used as an antigen in a vaccine to raise an immune response against infectious disease. 

Regeneration 

A polynucleotide or polypeptide of the present invention can be used to 
differentiate, proliferate, and attract cells, leading to the regeneration of tissues. (See,
Science 276:59-87 (1997).) The regeneration of tissues could be used to repair,
replace, or protect tissue damaged by congenital defects, trauma (wounds, burns,
incisions, or ulcers), age, disease (e.g., osteoporosis, osteoarthritis, periodontal
disease, liver failure), surgery, including cosmetic plastic surgery, fibrosis, reperfusion
injury, or systemic cytokine damage.

Tissues that could be regenerated using the present invention include organs 
(e.g., pancreas, liver, intestine, kidney, skin, endothelium), muscle (smooth, skeletal
or cardiac), vascular (including vascular endothelium), nervous, hematopoietic, and
skeletal (bone, cartilage, tendon, and ligament) tissue. Preferably, regeneration occurs
without scarring or with decreased scarring. Regeneration also may include angiogenesis.

Moreover, a polynucleotide or polypeptide of the present invention may increase 
regeneration of tissues difficult to heal. For example, increased tendon/ligament
regeneration would quicken recovery time after damage. A polynucleotide or
polypeptide of the present invention could also be used prophylactically in an effort to
avoid damage. Specific diseases that could be treated include tendinitis, carpal tunnel
syndrome, and other tendon or ligament defects. Further examples of non-healing
wounds whose tissue could be regenerated include pressure ulcers, ulcers associated with
vascular insufficiency, and surgical and traumatic wounds.

Similarly, nerve and brain tissue could also be regenerated by using a
polynucleotide or polypeptide of the present invention to proliferate and differentiate
nerve cells. Diseases that could be treated using this method include central and
peripheral nervous system diseases, neuropathies, or mechanical and traumatic
disorders (e.g., spinal cord disorders, head trauma, cerebrovascular disease, and
stroke). Specifically, diseases associated with peripheral nerve injuries, peripheral
neuropathy (e.g., resulting from chemotherapy or other medical therapies), localized
neuropathies, and central nervous system diseases (e.g., Alzheimer's disease,
Parkinson's disease, Huntington's disease, amyotrophic lateral sclerosis, and
Shy-Drager syndrome), could all be treated using the polynucleotide or polypeptide of the
present invention.

Chemotaxis 

A polynucleotide or polypeptide of the present invention may have chemotaxis
activity. A chemotactic molecule attracts or mobilizes cells (e.g., monocytes,
fibroblasts, neutrophils, T-cells, mast cells, eosinophils, epithelial and/or endothelial
cells) to a particular site in the body, such as a site of inflammation, infection, or
hyperproliferation. The mobilized cells can then fight off and/or heal the particular
trauma or abnormality.

A polynucleotide or polypeptide of the present invention may increase
chemotactic activity of particular cells. These chemotactic molecules can then be used to
treat inflammation, infection, hyperproliferative disorders, or any immune system
disorder by increasing the number of cells targeted to a particular location in the body.
For example, chemotactic molecules can be used to treat wounds and other trauma to
tissues by attracting immune cells to the injured location. Chemotactic molecules of the
present invention can also attract fibroblasts, which can be used to treat wounds.

It is also contemplated that a polynucleotide or polypeptide of the present
invention may inhibit chemotactic activity. These molecules could also be used to treat
disorders. Thus, a polynucleotide or polypeptide of the present invention could be used
as an inhibitor of chemotaxis.

Binding Activity 

A polypeptide of the present invention may be used to screen for molecules that
bind to the polypeptide or for molecules to which the polypeptide binds. The binding
of the polypeptide and the molecule may activate (agonist), increase, inhibit
(antagonist), or decrease activity of the polypeptide or the molecule bound. Examples
of such molecules include antibodies, oligonucleotides, proteins (e.g., receptors), or
small molecules.

Preferably, the molecule is closely related to the natural ligand of the
polypeptide, e.g., a fragment of the ligand, or a natural substrate, a ligand, or a structural
or functional mimetic. (See, Coligan et al., Current Protocols in Immunology
1(2):Chapter 5 (1991).) Similarly, the molecule can be closely related to the natural
receptor to which the polypeptide binds, or at least, a fragment of the receptor capable
of being bound by the polypeptide (e.g., active site). In either case, the molecule can
be rationally designed using known techniques.

Preferably, the screening for these molecules involves producing appropriate
cells which express the polypeptide, either as a secreted protein or on the cell
membrane. Preferred cells include cells from mammals, yeast, Drosophila, or E. coli.
Cells expressing the polypeptide (or cell membrane containing the expressed
polypeptide) are then preferably contacted with a test compound potentially containing
the molecule to observe binding, stimulation, or inhibition of activity of either the
polypeptide or the molecule.

The assay may simply test binding of a candidate compound to the polypeptide,
wherein binding is detected by a label or by competition with a labeled competitor.
Further, the assay may test whether the candidate compound results
in a signal generated by binding to the polypeptide.

Alternatively, the assay can be carried out using cell-free preparations,
polypeptide/molecule affixed to a solid support, chemical libraries, or natural product
mixtures. The assay may also simply comprise the steps of mixing a candidate
compound with a solution containing a polypeptide, measuring polypeptide/molecule
activity or binding, and comparing the polypeptide/molecule activity or binding to a
standard.
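
The final comparison step of such an assay can be illustrated with a minimal sketch. The
function name, the 20% tolerance, and the activity units below are hypothetical and are not
taken from this specification; an actual assay would set its own cut-offs from replicate
measurements.

```python
def classify_candidate(measured: float, standard: float, tolerance: float = 0.2) -> str:
    """Compare activity measured with a candidate compound to a standard.

    `tolerance` is an assumed relative band around the standard inside
    which the candidate is treated as having no clear effect.
    """
    if standard <= 0:
        raise ValueError("standard activity must be positive")
    ratio = measured / standard
    if ratio >= 1 + tolerance:
        return "increases activity"   # agonist-like behaviour
    if ratio <= 1 - tolerance:
        return "decreases activity"   # antagonist-like behaviour
    return "no clear change"
```

Under these assumed cut-offs, a reading of 150 against a standard of 100 would be reported
as an increase, and a reading of 105 as no clear change.
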

Preferably, an ELISA can measure polypeptide level or activity in a
sample (e.g., biological sample) using a monoclonal or polyclonal antibody. The
antibody can measure polypeptide level or activity by either binding, directly or
indirectly, to the polypeptide or by competing with the polypeptide for a substrate.

All of the above assays can be used as diagnostic or prognostic markers. The
molecules discovered using these assays can be used to treat disease or to bring about a
particular result in a patient (e.g., blood vessel growth) by activating or inhibiting the
polypeptide/molecule. Moreover, the assays can discover agents which may inhibit or
enhance the production of the polypeptide from suitably manipulated cells or tissues.

Therefore, the invention includes a method of identifying compounds which
bind to a polypeptide of the invention comprising the steps of: (a) incubating a
candidate binding compound with a polypeptide of the invention; and (b) determining if
binding has occurred. Moreover, the invention includes a method of identifying
agonists/antagonists comprising the steps of: (a) incubating a candidate compound with
a polypeptide of the invention, (b) assaying a biological activity, and (c) determining if
a biological activity of the polypeptide has been altered.

Other Activities 

A polypeptide or polynucleotide of the present invention may also increase or
decrease the differentiation or proliferation of embryonic stem cells, in addition to, as
discussed above, cells of the hematopoietic lineage.

A polypeptide or polynucleotide of the present invention may also be used to
modulate mammalian characteristics, such as body height, weight, hair color, eye color,
skin, percentage of adipose tissue, pigmentation, size, and shape (e.g., cosmetic
surgery). Similarly, a polypeptide or polynucleotide of the present invention may be
used to modulate mammalian metabolism affecting catabolism, anabolism, processing,
utilization, and storage of energy.

A polypeptide or polynucleotide of the present invention may be used to change
a mammal's mental state or physical state by influencing biorhythms, circadian
rhythms, depression (including depressive disorders), tendency for violence, tolerance
for pain, reproductive capabilities (preferably by Activin or Inhibin-like activity),
hormonal or endocrine levels, appetite, libido, memory, stress, or other cognitive
qualities.

A polypeptide or polynucleotide of the present invention may also be used as a
food additive or preservative, such as to increase or decrease storage capabilities, fat
content, lipid, protein, carbohydrate, vitamins, minerals, cofactors or other nutritional
components.

Other Preferred Embodiments

Other preferred embodiments of the claimed invention include an isolated
nucleic acid molecule comprising a nucleotide sequence which is at least 95% identical
to a sequence of at least about 50 contiguous nucleotides in the nucleotide sequence of
SEQ ID NO:X wherein X is any integer as defined in Table 1.
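
As a rough illustration of what "at least 95% identical to a sequence of at least about 50
contiguous nucleotides" asks for, the sketch below scores ungapped identity between every
50-nucleotide window of a candidate and every 50-nucleotide window of a reference. This is
a simplifying assumption for illustration only: the function names are hypothetical, gaps
are ignored, and the definition of percent identity given elsewhere in this specification
governs.

```python
def ungapped_identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return matches / len(a)


def best_window_identity(candidate: str, reference: str, window: int = 50) -> float:
    """Highest ungapped identity over all pairs of `window`-long stretches."""
    if window <= 0 or len(candidate) < window or len(reference) < window:
        return 0.0
    best = 0.0
    for i in range(len(candidate) - window + 1):
        stretch = candidate[i:i + window]
        for j in range(len(reference) - window + 1):
            best = max(best, ungapped_identity(stretch, reference[j:j + window]))
    return best


def meets_threshold(candidate: str, reference: str,
                    window: int = 50, min_identity: float = 0.95) -> bool:
    """True if some window pair reaches the identity threshold."""
    return best_window_identity(candidate, reference, window) >= min_identity
```

With a 50-nucleotide window, two mismatches give 48/50 = 96% and satisfy the threshold,
while three mismatches give 47/50 = 94% and do not.
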

Also preferred is a nucleic acid molecule wherein said sequence of contiguous
nucleotides is included in the nucleotide sequence of SEQ ID NO:X in the range of
positions beginning with the nucleotide at about the position of the 5' Nucleotide of the
Clone Sequence and ending with the nucleotide at about the position of the 3'
Nucleotide of the Clone Sequence as defined for SEQ ID NO:X in Table 1.

Also preferred is a nucleic acid molecule wherein said sequence of contiguous
nucleotides is included in the nucleotide sequence of SEQ ID NO:X in the range of
positions beginning with the nucleotide at about the position of the 5' Nucleotide of the
Start Codon and ending with the nucleotide at about the position of the 3' Nucleotide of
the Clone Sequence as defined for SEQ ID NO:X in Table 1.

Similarly preferred is a nucleic acid molecule wherein said sequence of
contiguous nucleotides is included in the nucleotide sequence of SEQ ID NO:X in the
range of positions beginning with the nucleotide at about the position of the 5'
Nucleotide of the First Amino Acid of the Signal Peptide and ending with the nucleotide
at about the position of the 3' Nucleotide of the Clone Sequence as defined for SEQ ID
NO:X in Table 1.
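
The three position ranges described above can be sketched as simple 1-based, inclusive
slices of SEQ ID NO:X. The record type, its field names, and the example values used with
it are hypothetical stand-ins for the corresponding Table 1 columns, and exact slicing
ignores the "about" in "at about the position".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Table1Row:
    """Hypothetical holder for the Table 1 positions of one SEQ ID NO:X."""
    clone_5p: int           # 5' nucleotide of the clone sequence
    clone_3p: int           # 3' nucleotide of the clone sequence
    start_codon_5p: int     # 5' nucleotide of the start codon
    signal_peptide_5p: int  # 5' nucleotide of first amino acid of signal peptide


def region(seq: str, start: int, end: int) -> str:
    """Return positions start..end of seq (1-based, inclusive)."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("positions must satisfy 1 <= start <= end <= len(seq)")
    return seq[start - 1:end]


def preferred_ranges(seq: str, row: Table1Row) -> dict:
    """Slice the three ranges, each ending at the 3' nucleotide of the clone."""
    return {
        "clone": region(seq, row.clone_5p, row.clone_3p),
        "from_start_codon": region(seq, row.start_codon_5p, row.clone_3p),
        "from_signal_peptide": region(seq, row.signal_peptide_5p, row.clone_3p),
    }
```
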

Also preferred is an isolated nucleic acid molecule comprising a nucleotide
sequence which is at least 95% identical to a sequence of at least about 150 contiguous
nucleotides in the nucleotide sequence of SEQ ID NO:X.

Further preferred is an isolated nucleic acid molecule comprising a nucleotide 
sequence which is at least 95% identical to a sequence of at least about 500 contiguous 
nucleotides in the nucleotide sequence of SEQ ID NO:X. 



A further preferred embodiment is a nucleic acid molecule comprising a
nucleotide sequence which is at least 95% identical to the nucleotide sequence of SEQ
ID NO:X beginning with the nucleotide at about the position of the 5' Nucleotide of the
First Amino Acid of the Signal Peptide and ending with the nucleotide at about the
position of the 3' Nucleotide of the Clone Sequence as defined for SEQ ID NO:X in
Table 1.

A further preferred embodiment is an isolated nucleic acid molecule comprising
a nucleotide sequence which is at least 95% identical to the complete nucleotide
sequence of SEQ ID NO:X.

Also preferred is an isolated nucleic acid molecule which hybridizes under
stringent hybridization conditions to a nucleic acid molecule, wherein said nucleic acid
molecule which hybridizes does not hybridize under stringent hybridization conditions
to a nucleic acid molecule having a nucleotide sequence consisting of only A residues or
of only T residues.

Also preferred is a composition of matter comprising a DNA molecule which
comprises a human cDNA clone identified by a cDNA Clone Identifier in Table 1,
which DNA molecule is contained in the material deposited with the American Type
Culture Collection and given the ATCC Deposit Number shown in Table 1 for said
cDNA Clone Identifier.

Also preferred is an isolated nucleic acid molecule comprising a nucleotide
sequence which is at least 95% identical to a sequence of at least 50 contiguous
nucleotides in the nucleotide sequence of a human cDNA clone identified by a cDNA
Clone Identifier in Table 1, which DNA molecule is contained in the deposit given the
ATCC Deposit Number shown in Table 1.

Also preferred is an isolated nucleic acid molecule, wherein said sequence of at
least 50 contiguous nucleotides is included in the nucleotide sequence of the complete
open reading frame sequence encoded by said human cDNA clone.

Also preferred is an isolated nucleic acid molecule comprising a nucleotide
sequence which is at least 95% identical to a sequence of at least 150 contiguous
nucleotides in the nucleotide sequence encoded by said human cDNA clone.

A further preferred embodiment is an isolated nucleic acid molecule comprising
a nucleotide sequence which is at least 95% identical to a sequence of at least 500
contiguous nucleotides in the nucleotide sequence encoded by said human cDNA clone.

A further preferred embodiment is an isolated nucleic acid molecule comprising
a nucleotide sequence which is at least 95% identical to the complete nucleotide
sequence encoded by said human cDNA clone.

A further preferred embodiment is a method for detecting in a biological sample
a nucleic acid molecule comprising a nucleotide sequence which is at least 95% identical
to a sequence of at least 50 contiguous nucleotides in a sequence selected from the
group consisting of: a nucleotide sequence of SEQ ID NO:X wherein X is any integer
as defined in Table 1; and a nucleotide sequence encoded by a human cDNA clone
identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with the
ATCC Deposit Number shown for said cDNA clone in Table 1; which method
comprises a step of comparing a nucleotide sequence of at least one nucleic acid
molecule in said sample with a sequence selected from said group and determining
whether the sequence of said nucleic acid molecule in said sample is at least 95%
identical to said selected sequence.

Also preferred is the above method wherein said step of comparing sequences
comprises determining the extent of nucleic acid hybridization between nucleic acid
molecules in said sample and a nucleic acid molecule comprising said sequence selected
from said group. Similarly, also preferred is the above method wherein said step of
comparing sequences is performed by comparing the nucleotide sequence determined
from a nucleic acid molecule in said sample with said sequence selected from said
group. The nucleic acid molecules can comprise DNA molecules or RNA molecules.

A further preferred embodiment is a method for identifying the species, tissue or
cell type of a biological sample which method comprises a step of detecting nucleic acid
molecules in said sample, if any, comprising a nucleotide sequence that is at least 95%
identical to a sequence of at least 50 contiguous nucleotides in a sequence selected from
the group consisting of: a nucleotide sequence of SEQ ID NO:X wherein X is any
integer as defined in Table 1; and a nucleotide sequence encoded by a human cDNA
clone identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with
the ATCC Deposit Number shown for said cDNA clone in Table 1.

The method for identifying the species, tissue or cell type of a biological sample
can comprise a step of detecting nucleic acid molecules comprising a nucleotide
sequence in a panel of at least two nucleotide sequences, wherein at least one sequence
in said panel is at least 95% identical to a sequence of at least 50 contiguous nucleotides
in a sequence selected from said group.

Also preferred is a method for diagnosing in a subject a pathological condition
associated with abnormal structure or expression of a gene encoding a secreted protein
identified in Table 1, which method comprises a step of detecting in a biological sample
obtained from said subject nucleic acid molecules, if any, comprising a nucleotide
sequence that is at least 95% identical to a sequence of at least 50 contiguous
nucleotides in a sequence selected from the group consisting of: a nucleotide sequence
of SEQ ID NO:X wherein X is any integer as defined in Table 1; and a nucleotide
sequence encoded by a human cDNA clone identified by a cDNA Clone Identifier in
Table 1 and contained in the deposit with the ATCC Deposit Number shown for said
cDNA clone in Table 1.

The method for diagnosing a pathological condition can comprise a step of
detecting nucleic acid molecules comprising a nucleotide sequence in a panel of at least
two nucleotide sequences, wherein at least one sequence in said panel is at least 95%
identical to a sequence of at least 50 contiguous nucleotides in a sequence selected from
said group.

Also preferred is a composition of matter comprising isolated nucleic acid
molecules wherein the nucleotide sequences of said nucleic acid molecules comprise a
panel of at least two nucleotide sequences, wherein at least one sequence in said panel is
at least 95% identical to a sequence of at least 50 contiguous nucleotides in a sequence
selected from the group consisting of: a nucleotide sequence of SEQ ID NO:X wherein
X is any integer as defined in Table 1; and a nucleotide sequence encoded by a human
cDNA clone identified by a cDNA Clone Identifier in Table 1 and contained in the
deposit with the ATCC Deposit Number shown for said cDNA clone in Table 1. The
nucleic acid molecules can comprise DNA molecules or RNA molecules.

Also preferred is an isolated polypeptide comprising an amino acid sequence at
least 90% identical to a sequence of at least about 10 contiguous amino acids in the
amino acid sequence of SEQ ID NO:Y wherein Y is any integer as defined in Table 1.

Also preferred is a polypeptide, wherein said sequence of contiguous amino
acids is included in the amino acid sequence of SEQ ID NO:Y in the range of positions
beginning with the residue at about the position of the First Amino Acid of the Secreted
Portion and ending with the residue at about the Last Amino Acid of the Open Reading
Frame as set forth for SEQ ID NO:Y in Table 1.

Also preferred is an isolated polypeptide comprising an amino acid sequence at
least 95% identical to a sequence of at least about 30 contiguous amino acids in the
amino acid sequence of SEQ ID NO:Y.

Further preferred is an isolated polypeptide comprising an amino acid sequence
at least 95% identical to a sequence of at least about 100 contiguous amino acids in the
amino acid sequence of SEQ ID NO:Y.

Further preferred is an isolated polypeptide comprising an amino acid sequence
at least 95% identical to the complete amino acid sequence of SEQ ID NO:Y.

Further preferred is an isolated polypeptide comprising an amino acid sequence
at least 90% identical to a sequence of at least about 10 contiguous amino acids in the
complete amino acid sequence of a secreted protein encoded by a human cDNA clone
identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with the
ATCC Deposit Number shown for said cDNA clone in Table 1.

Also preferred is a polypeptide wherein said sequence of contiguous amino
acids is included in the amino acid sequence of a secreted portion of the secreted protein
encoded by a human cDNA clone identified by a cDNA Clone Identifier in Table 1 and
contained in the deposit with the ATCC Deposit Number shown for said cDNA clone in
Table 1.

Also preferred is an isolated polypeptide comprising an amino acid sequence at
least 95% identical to a sequence of at least about 30 contiguous amino acids in the
amino acid sequence of the secreted portion of the protein encoded by a human cDNA
clone identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with
the ATCC Deposit Number shown for said cDNA clone in Table 1.

Also preferred is an isolated polypeptide comprising an amino acid sequence at
least 95% identical to a sequence of at least about 100 contiguous amino acids in the
amino acid sequence of the secreted portion of the protein encoded by a human cDNA
clone identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with
the ATCC Deposit Number shown for said cDNA clone in Table 1.

Also preferred is an isolated polypeptide comprising an amino acid sequence at
least 95% identical to the amino acid sequence of the secreted portion of the protein
encoded by a human cDNA clone identified by a cDNA Clone Identifier in Table 1 and
contained in the deposit with the ATCC Deposit Number shown for said cDNA clone in
Table 1.

Further preferred is an isolated antibody which binds specifically to a
polypeptide comprising an amino acid sequence that is at least 90% identical to a
sequence of at least 10 contiguous amino acids in a sequence selected from the group
consisting of: an amino acid sequence of SEQ ID NO:Y wherein Y is any integer as
defined in Table 1; and a complete amino acid sequence of a protein encoded by a
human cDNA clone identified by a cDNA Clone Identifier in Table 1 and contained in
the deposit with the ATCC Deposit Number shown for said cDNA clone in Table 1.

Further preferred is a method for detecting in a biological sample a polypeptide
comprising an amino acid sequence which is at least 90% identical to a sequence of at
least 10 contiguous amino acids in a sequence selected from the group consisting of: an
amino acid sequence of SEQ ID NO:Y wherein Y is any integer as defined in Table 1;
and a complete amino acid sequence of a protein encoded by a human cDNA clone
identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with the
ATCC Deposit Number shown for said cDNA clone in Table 1; which method
comprises a step of comparing an amino acid sequence of at least one polypeptide
molecule in said sample with a sequence selected from said group and determining
whether the sequence of said polypeptide molecule in said sample is at least 90%
identical to said sequence of at least 10 contiguous amino acids.

Also preferred is the above method wherein said step of comparing an amino
acid sequence of at least one polypeptide molecule in said sample with a sequence
selected from said group comprises determining the extent of specific binding of
polypeptides in said sample to an antibody which binds specifically to a polypeptide
comprising an amino acid sequence that is at least 90% identical to a sequence of at least
10 contiguous amino acids in a sequence selected from the group consisting of: an
amino acid sequence of SEQ ID NO:Y wherein Y is any integer as defined in Table 1;
and a complete amino acid sequence of a protein encoded by a human cDNA clone
identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with the
ATCC Deposit Number shown for said cDNA clone in Table 1.

Also preferred is the above method wherein said step of comparing sequences is 
performed by comparing the amino acid sequence determined from a polypeptide 
molecule in said sample with said sequence selected from said group. 

Also preferred is a method for identifying the species, tissue or cell type of a
biological sample which method comprises a step of detecting polypeptide molecules in
said sample, if any, comprising an amino acid sequence that is at least 90% identical to
a sequence of at least 10 contiguous amino acids in a sequence selected from the group
consisting of: an amino acid sequence of SEQ ID NO:Y wherein Y is any integer as
defined in Table 1; and a complete amino acid sequence of a secreted protein encoded
by a human cDNA clone identified by a cDNA Clone Identifier in Table 1 and contained
in the deposit with the ATCC Deposit Number shown for said cDNA clone in Table 1.

Also preferred is the above method for identifying the species, tissue or cell type
of a biological sample, which method comprises a step of detecting polypeptide
molecules comprising an amino acid sequence in a panel of at least two amino acid
sequences, wherein at least one sequence in said panel is at least 90% identical to a
sequence of at least 10 contiguous amino acids in a sequence selected from the above
group.

Also preferred is a method for diagnosing in a subject a pathological condition
associated with abnormal structure or expression of a gene encoding a secreted protein
identified in Table 1, which method comprises a step of detecting in a biological sample
obtained from said subject polypeptide molecules comprising an amino acid sequence in
a panel of at least two amino acid sequences, wherein at least one sequence in said panel
is at least 90% identical to a sequence of at least 10 contiguous amino acids in a
sequence selected from the group consisting of: an amino acid sequence of SEQ ID
NO:Y wherein Y is any integer as defined in Table 1; and a complete amino acid
sequence of a secreted protein encoded by a human cDNA clone identified by a cDNA
Clone Identifier in Table 1 and contained in the deposit with the ATCC Deposit Number
shown for said cDNA clone in Table 1.

In any of these methods, the step of detecting said polypeptide molecules
includes using an antibody.

Also preferred is an isolated nucleic acid molecule comprising a nucleotide
sequence which is at least 95% identical to a nucleotide sequence encoding a
polypeptide wherein said polypeptide comprises an amino acid sequence that is at least
90% identical to a sequence of at least 10 contiguous amino acids in a sequence selected
from the group consisting of: an amino acid sequence of SEQ ID NO:Y wherein Y is
any integer as defined in Table 1; and a complete amino acid sequence of a secreted
protein encoded by a human cDNA clone identified by a cDNA Clone Identifier in Table
1 and contained in the deposit with the ATCC Deposit Number shown for said cDNA
clone in Table 1.

Also preferred is an isolated nucleic acid molecule, wherein said nucleotide
sequence encoding a polypeptide has been optimized for expression of said polypeptide
in a prokaryotic host.

Also preferred is an isolated nucleic acid molecule, wherein said polypeptide
comprises an amino acid sequence selected from the group consisting of: an amino acid
sequence of SEQ ID NO:Y wherein Y is any integer as defined in Table 1; and a
complete amino acid sequence of a secreted protein encoded by a human cDNA clone
identified by a cDNA Clone Identifier in Table 1 and contained in the deposit with the
ATCC Deposit Number shown for said cDNA clone in Table 1.

Further preferred is a method of making a recombinant vector comprising
inserting any of the above isolated nucleic acid molecules into a vector. Also preferred is
the recombinant vector produced by this method. Also preferred is a method of making
a recombinant host cell comprising introducing the vector into a host cell, as well as the
recombinant host cell produced by this method.

Also preferred is a method of making an isolated polypeptide comprising
culturing this recombinant host cell under conditions such that said polypeptide is
expressed and recovering said polypeptide. Also preferred is this method of making an
isolated polypeptide, wherein said recombinant host cell is a eukaryotic cell and said
polypeptide is a secreted portion of a human secreted protein comprising an amino acid
sequence selected from the group consisting of: an amino acid sequence of SEQ ID
NO:Y beginning with the residue at the position of the First Amino Acid of the Secreted
Portion of SEQ ID NO:Y wherein Y is an integer set forth in Table 1 and said position
of the First Amino Acid of the Secreted Portion of SEQ ID NO:Y is defined in Table 1;
and an amino acid sequence of a secreted portion of a protein encoded by a human
cDNA clone identified by a cDNA Clone Identifier in Table 1 and contained in the
deposit with the ATCC Deposit Number shown for said cDNA clone in Table 1. The
isolated polypeptide produced by this method is also preferred.
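
The relationship between SEQ ID NO:Y and its secreted portion can be sketched as a split
at the position of the First Amino Acid of the Secreted Portion. The function name is
hypothetical, the position is assumed to be 1-based as read from Table 1, and the example
sequence in the usage note is invented for illustration.

```python
def split_secreted(protein: str, first_secreted_aa: int) -> tuple:
    """Split a protein into (leader, secreted portion).

    `first_secreted_aa` is the 1-based position of the first residue of
    the secreted portion; everything before it is treated as the leader.
    """
    if not (1 <= first_secreted_aa <= len(protein)):
        raise ValueError("position must lie within the protein sequence")
    leader = protein[:first_secreted_aa - 1]
    secreted = protein[first_secreted_aa - 1:]
    return leader, secreted
```

For the invented sequence "MKLLVLGLAQPETR" with the secreted portion starting at position
10, the call returns the leader "MKLLVLGLA" and the secreted portion "QPETR".
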

Also preferred is a method of treatment of an individual in need of an increased
level of a secreted protein activity, which method comprises administering to such an
individual a pharmaceutical composition comprising an amount of an isolated
polypeptide, polynucleotide, or antibody of the claimed invention effective to increase
the level of said protein activity in said individual.

Having generally described the invention, the same will be more readily 
understood by reference to the following examples, which are provided by way of 
illustration and are not intended as limiting. 

Examples

Example 1: Isolation of a Selected cDNA Clone From the Deposited
Sample

Each cDNA clone in a cited ATCC deposit is contained in a plasmid vector.
Table 1 identifies the vectors used to construct the cDNA library from which each clone
was isolated. In many cases, the vector used to construct the library is a phage vector
from which a plasmid has been excised. The table immediately below correlates the
related plasmid for each phage vector used in constructing the cDNA library. For
example, where a particular clone is identified in Table 1 as being isolated in the vector
"Lambda Zap," the corresponding deposited clone is in "pBluescript."

Vector Used to Construct Library    Corresponding Deposited Plasmid
Lambda Zap                          pBluescript (pBS)
Uni-Zap XR                          pBluescript (pBS)
Zap Express                         pBK
lafmid BA                           plafmid BA
pSport1                             pSport1
pCMVSport 2.0                       pCMVSport 2.0
pCMVSport 3.0                       pCMVSport 3.0
pCR®2.1                             pCR®2.1
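
For readers handling the table programmatically, the correspondence can be held as a
simple lookup. This is only a convenience sketch: the dictionary and function names are
hypothetical, and the entries are copied from the table as it appears in this text.

```python
# Library vector -> deposited plasmid, as listed in the table in this text.
DEPOSITED_PLASMID = {
    "Lambda Zap": "pBluescript (pBS)",
    "Uni-Zap XR": "pBluescript (pBS)",
    "Zap Express": "pBK",
    "lafmid BA": "plafmid BA",
    "pSport1": "pSport1",
    "pCMVSport 2.0": "pCMVSport 2.0",
    "pCMVSport 3.0": "pCMVSport 3.0",
    "pCR®2.1": "pCR®2.1",
}


def deposited_plasmid(library_vector: str) -> str:
    """Return the deposited plasmid for a library vector named in Table 1."""
    try:
        return DEPOSITED_PLASMID[library_vector]
    except KeyError:
        raise KeyError(f"vector not listed in the table: {library_vector!r}") from None
```
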

Vectors Lambda Zap (U.S. Patent Nos. 5,128,256 and 5,286,636), Uni-Zap
XR (U.S. Patent Nos. 5,128,256 and 5,286,636), Zap Express (U.S. Patent Nos.
5,128,256 and 5,286,636), pBluescript (pBS) (Short, J. M. et al., Nucleic Acids Res.
16:7583-7600 (1988); Alting-Mees, M. A. and Short, J. M., Nucleic Acids Res.
17:9494 (1989)) and pBK (Alting-Mees, M. A. et al., Strategies 5:58-61 (1992)) are
commercially available from Stratagene Cloning Systems, Inc., 11011 N. Torrey Pines
Road, La Jolla, CA, 92037. pBS contains an ampicillin resistance gene and pBK
contains a neomycin resistance gene. Both can be transformed into E. coli strain XL-1
Blue, also available from Stratagene. pBS comes in 4 forms: SK+, SK-, KS+ and KS-.
The S and K refer to the orientation of the polylinker to the T7 and T3 primer
sequences which flank the polylinker region ("S" is for SacI and "K" is for KpnI, which
are the first sites on each respective end of the linker). "+" or "-" refer to the orientation
of the f1 origin of replication ("ori"), such that in one orientation, single stranded rescue
initiated from the f1 ori generates sense strand DNA and in the other, antisense.

Vectors pSport1, pCMVSport 2.0 and pCMVSport 3.0 were obtained from
Life Technologies, Inc., P. O. Box 6009, Gaithersburg, MD 20897. All Sport vectors
contain an ampicillin resistance gene and may be transformed into E. coli strain
DH10B, also available from Life Technologies. (See, for instance, Gruber, C. E., et
al., Focus 15:59 (1993).) Vector lafmid BA (Bento Soares, Columbia University, NY)
contains an ampicillin resistance gene and can be transformed into E. coli strain XL-1
Blue. Vector pCR®2.1, which is available from Invitrogen, 1600 Faraday Avenue,
Carlsbad, CA 92008, contains an ampicillin resistance gene and may be transformed
into E. coli strain DH10B, available from Life Technologies. (See, for instance, Clark,
J. M., Nuc. Acids Res. 16:9677-9686 (1988) and Mead, D. et al., Bio/Technology 9:
(1991).) Preferably, a polynucleotide of the present invention does not comprise the
phage vector sequences identified for the particular clone in Table 1, as well as the
corresponding plasmid vector sequences designated above.

The deposited material in the sample assigned the ATCC Deposit Number cited
in Table 1 for any given cDNA clone also may contain one or more additional plasmids,
each comprising a cDNA clone different from that given clone. Thus, deposits sharing
the same ATCC Deposit Number contain at least a plasmid for each cDNA clone
identified in Table 1. Typically, each ATCC deposit sample cited in Table 1 comprises
a mixture of approximately equal amounts (by weight) of about 50 plasmid DNAs, each
containing a different cDNA clone; but such a deposit sample may include plasmids for
more or fewer than 50 cDNA clones, up to about 500 cDNA clones.

Two approaches can be used to isolate a particular clone from the deposited 
sample of plasmid DNAs cited for that clone in Table 1. First, a plasmid is directly 
isolated by screening the clones using a polynucleotide probe corresponding to SEQ ID 
NO:X. 

Particularly, a specific polynucleotide with 30-40 nucleotides is synthesized 
using an Applied Biosystems DNA synthesizer according to the sequence reported. 
The oligonucleotide is labeled, for instance, with 32P-γ-ATP using T4 polynucleotide 
kinase and purified according to routine methods. (E.g., Maniatis et al., Molecular 
Cloning: A Laboratory Manual, Cold Spring Harbor Press, Cold Spring Harbor, NY (1982).) 
The plasmid mixture is transformed into a suitable host, as indicated above (such as 
XL-1 Blue (Stratagene)) using techniques known to those of skill in the art, such as 
those provided by the vector supplier or in related publications or patents cited above. 
The transformants are plated on 1.5% agar plates (containing the appropriate selection 
agent, e.g., ampicillin) to a density of about 150 transformants (colonies) per plate. 
These plates are screened using Nylon membranes according to routine methods for 
bacterial colony screening (e.g., Sambrook et al., Molecular Cloning: A Laboratory 
Manual, 2nd Edit., (1989), Cold Spring Harbor Laboratory Press, pages 1.93 to 
1.104), or other techniques known to those of skill in the art. 
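
As a rough planning aid for the colony screen above, the following Python sketch estimates how likely a single plate is to carry at least one colony of the desired clone. It assumes, purely for illustration, that every plasmid in the deposit is equally represented among the transformants and that colonies are independent of one another; the function name is hypothetical and is not part of the described method.

```python
def p_plate_has_clone(colonies_per_plate: int, clones_in_deposit: int) -> float:
    """Probability that a plate carries at least one colony of a given clone.

    Assumption (not from the protocol): each colony independently carries the
    target plasmid with probability 1 / clones_in_deposit.
    """
    if colonies_per_plate < 0 or clones_in_deposit < 1:
        raise ValueError("need colonies_per_plate >= 0 and clones_in_deposit >= 1")
    p_miss_one = 1.0 - 1.0 / clones_in_deposit
    # Chance that every colony on the plate misses the target, then complement.
    return 1.0 - p_miss_one ** colonies_per_plate


if __name__ == "__main__":
    # About 150 colonies per plate; deposits of about 50 and up to about 500 clones.
    for n_clones in (50, 500):
        print(n_clones, round(p_plate_has_clone(150, n_clones), 3))
```

Under these assumptions a single plate is usually sufficient for a deposit of about 50 clones, whereas a deposit approaching 500 clones would call for several plates.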

Alternatively, two primers of 17-20 nucleotides derived from both ends of the 
SEQ ID NO:X (i.e., within the region of SEQ ID NO:X bounded by the 5' NT and the 
3' NT of the clone defined in Table 1) are synthesized and used to amplify the desired 
cDNA using the deposited cDNA plasmid as a template. The polymerase chain reaction 
is carried out under routine conditions, for instance, in 25 µl of reaction mixture with 
0.5 µg of the above cDNA template. A convenient reaction mixture is 1.5-5 mM 
MgCl2, 0.01% (w/v) gelatin, 20 µM each of dATP, dCTP, dGTP, dTTP, 25 pmol of 
each primer and 0.25 Unit of Taq polymerase. Thirty-five cycles of PCR (denaturation 
at 94°C for 1 min; annealing at 55°C for 1 min; elongation at 72°C for 1 min) are 
performed with a Perkin-Elmer Cetus automated thermal cycler. The amplified product 
is analyzed by agarose gel electrophoresis and the DNA band with expected molecular 
weight is excised and purified. The PCR product is verified to be the selected sequence 
by subcloning and sequencing the DNA product. 
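
The derivation of the two end primers can be sketched in Python as follows. This is a minimal illustration only: it takes the forward primer from the 5' end of the bounded region and the reverse primer as the reverse complement of the 3' end, and it does not check melting temperature, GC content, or secondary structure. The function names and the 1-based coordinate convention are assumptions made for the example.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def end_primers(seq: str, five_nt: int, three_nt: int, length: int = 20):
    """Derive a primer pair from both ends of the region five_nt..three_nt.

    five_nt and three_nt are 1-based, inclusive positions (an assumed
    convention for the "5' NT" and "3' NT" columns of Table 1).
    """
    if not 17 <= length <= 20:
        raise ValueError("primer length should be 17-20 nucleotides")
    region = seq[five_nt - 1:three_nt]
    if len(region) < 2 * length:
        raise ValueError("region is too short for two non-overlapping primers")
    forward = region[:length]                       # same strand, 5' end
    reverse = reverse_complement(region[-length:])  # opposite strand, 3' end
    return forward, reverse


if __name__ == "__main__":
    demo = "ATGC" * 15  # hypothetical 60-nt sequence, not a real clone
    print(end_primers(demo, 1, 60, length=17))
```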

Several methods are available for the identification of the 5' or 3' non-coding 
portions of a gene which may not be present in the deposited clone. These methods 
include, but are not limited to, filter probing, clone enrichment using specific probes, 
and protocols similar or identical to 5' and 3' "RACE" protocols which are well known 
in the art. For instance, a method similar to 5' RACE is available for generating the 
missing 5' end of a desired full-length transcript. (Fromont-Racine et al., Nucleic Acids 
Res. 21(7):1683-1684 (1993).) 

Briefly, a specific RNA oligonucleotide is ligated to the 5' ends of a population 
of RNA presumably containing full-length gene RNA transcripts. A primer set 
containing a primer specific to the ligated RNA oligonucleotide and a primer specific to 
a known sequence of the gene of interest is used to PCR amplify the 5' portion of the 
desired full-length gene. This amplified product may then be sequenced and used to 
generate the full length gene. 

The above method starts with total RNA isolated from the desired source, 
although poly-A+ RNA can be used. The RNA preparation can then be treated with 
phosphatase if necessary to eliminate 5' phosphate groups on degraded or damaged 
RNA which may interfere with the later RNA ligase step. The phosphatase should then 
be inactivated and the RNA treated with tobacco acid pyrophosphatase in order to 
remove the cap structure present at the 5' ends of messenger RNAs. This reaction 
leaves a 5' phosphate group at the 5' end of the cap cleaved RNA which can then be 
ligated to an RNA oligonucleotide using T4 RNA ligase. 

This modified RNA preparation is used as a template for first strand cDNA 
synthesis using a gene specific oligonucleotide. The first strand synthesis reaction is 
used as a template for PCR amplification of the desired 5' end using a primer specific to 
the ligated RNA oligonucleotide and a primer specific to the known sequence of the 
gene of interest. The resultant product is then sequenced and analyzed to confirm that 
the 5' end sequence belongs to the desired gene. 

Example 2: Isolation of Genomic Clones Corresponding to a 
Polynucleotide 

A human genomic P1 library (Genomic Systems, Inc.) is screened by PCR 
using primers selected for the cDNA sequence corresponding to SEQ ID NO:X, 
according to the method described in Example 1. (See also, Sambrook.) 

Example 3: Tissue Distribution of Polypeptide 

Tissue distribution of mRNA expression of polynucleotides of the present 
invention is determined using protocols for Northern blot analysis, described by, 
among others, Sambrook et al. For example, a cDNA probe produced by the method 
described in Example 1 is labeled with P32 using the rediprime™ DNA labeling system 
(Amersham Life Science), according to manufacturer's instructions. After labeling, the 
probe is purified using a CHROMA SPIN-100™ column (Clontech Laboratories, Inc.), 
according to manufacturer's protocol number PT1200-1. The purified labeled probe is 
then used to examine various human tissues for mRNA expression. 

Multiple Tissue Northern (MTN) blots containing various human tissues (H) or 
human immune system tissues (IM) (Clontech) are examined with the labeled probe 
using ExpressHyb™ hybridization solution (Clontech) according to manufacturer's 
protocol number PT1190-1. Following hybridization and washing, the blots are 
mounted and exposed to film at -70°C overnight, and the films developed according to 
standard procedures. 

Example 4: Chromosomal Mapping of the Polynucleotides 

An oligonucleotide primer set is designed according to the sequence at the 5' 
end of SEQ ID NO:X. This primer preferably spans about 100 nucleotides. This 
primer set is then used in a polymerase chain reaction under the following set of 
conditions: 30 seconds, 95°C; 1 minute, 56°C; 1 minute, 70°C. This cycle is repeated 
32 times followed by one 5 minute cycle at 70°C. Human, mouse, and hamster DNA 
is used as template in addition to a somatic cell hybrid panel containing individual 
chromosomes or chromosome fragments (Bios, Inc). The reactions are analyzed on 
either 8% polyacrylamide gels or 3.5% agarose gels. Chromosome mapping is 
determined by the presence of an approximately 100 bp PCR fragment in the particular 
somatic cell hybrid. 
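
The scoring of a hybrid panel can be illustrated with a short Python sketch. It reports the human chromosomes whose presence in each hybrid matches the presence of the approximately 100 bp PCR fragment exactly. The panel layout, hybrid names, and function name are hypothetical, and a real panel may contain chromosome fragments and discordant hybrids that require a more tolerant scoring rule.

```python
def concordant_chromosomes(panel: dict, pcr_positive: dict) -> list:
    """Return chromosomes whose presence matches the PCR result in every hybrid.

    panel:        hybrid name -> set of human chromosomes retained by that hybrid
    pcr_positive: hybrid name -> True if the ~100 bp fragment was observed
    """
    chromosomes = set().union(*panel.values())
    hits = []
    for chrom in sorted(chromosomes):
        # A chromosome is a candidate only if it is present exactly in the
        # hybrids that gave a PCR product and absent from all the others.
        if all((chrom in content) == pcr_positive[hybrid]
               for hybrid, content in panel.items()):
            hits.append(chrom)
    return hits


if __name__ == "__main__":
    # Hypothetical three-hybrid panel, for illustration only.
    panel = {"h1": {"1", "17"}, "h2": {"17", "X"}, "h3": {"1", "X"}}
    results = {"h1": True, "h2": True, "h3": False}
    print(concordant_chromosomes(panel, results))
```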

Example 5: Bacterial Expression of a Polypeptide 

A polynucleotide encoding a polypeptide of the present invention is amplified 
using PCR oligonucleotide primers corresponding to the 5' and 3' ends of the DNA 
sequence, as outlined in Example 1, to synthesize insertion fragments. The primers 
used to amplify the cDNA insert should preferably contain restriction sites, such as 
BamHI and XbaI, at the 5' end of the primers in order to clone the amplified product 
into the expression vector. For example, BamHI and XbaI correspond to the restriction 
enzyme sites on the bacterial expression vector pQE-9 (Qiagen, Inc., Chatsworth, 
CA). This plasmid vector encodes antibiotic resistance (Ampr), a bacterial origin of 
replication (ori), an IPTG-regulatable promoter/operator (P/O), a ribosome binding site 
(RBS), a 6-histidine tag (6-His), and restriction enzyme cloning sites. 

The pQE-9 vector is digested with BamHI and XbaI and the amplified fragment 
is ligated into the pQE-9 vector maintaining the reading frame initiated at the bacterial 
RBS. The ligation mixture is then used to transform the E. coli strain M15/rep4 
(Qiagen, Inc.) which contains multiple copies of the plasmid pREP4, which expresses 
the lacI repressor and also confers kanamycin resistance (Kanr). Transformants are 
identified by their ability to grow on LB plates and ampicillin/kanamycin resistant 
colonies are selected. Plasmid DNA is isolated and confirmed by restriction analysis. 

Clones containing the desired constructs are grown overnight (O/N) in liquid 
culture in LB media supplemented with both Amp (100 µg/ml) and Kan (25 µg/ml). 
The O/N culture is used to inoculate a large culture at a ratio of 1:100 to 1:250. The 
cells are grown to an optical density 600 (O.D.600) of between 0.4 and 0.6. IPTG 
(Isopropyl-β-D-thiogalactopyranoside) is then added to a final concentration of 1 mM. 
IPTG induces expression by inactivating the lacI repressor, clearing the P/O and leading 
to increased gene expression. 
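
The inoculation ratio and the IPTG addition are simple dilution calculations, sketched below in Python. The 1 M IPTG stock and the 1 liter culture volume are assumptions chosen for the example and are not specified in the protocol; the small volume contributed by the stock itself is ignored.

```python
def stock_volume_ml(final_conc_mM: float, culture_volume_ml: float,
                    stock_conc_mM: float) -> float:
    """Volume of stock to add, from C1 * V1 = C2 * V2 (added volume ignored)."""
    if stock_conc_mM <= 0:
        raise ValueError("stock concentration must be positive")
    return final_conc_mM * culture_volume_ml / stock_conc_mM


def inoculum_range_ml(culture_volume_ml: float) -> tuple:
    """Overnight-culture volume for inoculation ratios of 1:250 and 1:100."""
    return culture_volume_ml / 250.0, culture_volume_ml / 100.0


if __name__ == "__main__":
    culture_ml = 1000.0  # hypothetical 1 liter expression culture
    print("inoculum (ml):", inoculum_range_ml(culture_ml))
    # Hypothetical 1 M (1000 mM) IPTG stock, diluted to 1 mM final.
    print("IPTG stock (ml):", stock_volume_ml(1.0, culture_ml, 1000.0))
```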

Cells are grown for an extra 3 to 4 hours. Cells are then harvested by 
centrifugation (20 mins at 6000 x g). The cell pellet is solubilized in the chaotropic 
agent 6 Molar Guanidine HCl by stirring for 3-4 hours at 4°C. The cell debris is 
removed by centrifugation, and the supernatant containing the polypeptide is loaded 
onto a nickel-nitrilo-tri-acetic acid ("Ni-NTA") affinity resin column (available from 
QIAGEN, Inc., supra). Proteins with a 6 x His tag bind to the Ni-NTA resin with high 
affinity and can be purified in a simple one-step procedure (for details see: The 
QIAexpressionist (1995) QIAGEN, Inc., supra). 

Briefly, the supernatant is loaded onto the column in 6 M guanidine-HCl, pH 8, 
the column is first washed with 10 volumes of 6 M guanidine-HCl, pH 8, then washed 
with 10 volumes of 6 M guanidine-HCl, pH 6, and finally the polypeptide is eluted with 
6 M guanidine-HCl, pH 5. 

The purified protein is then renatured by dialyzing it against phosphate-buffered 
saline (PBS) or 50 mM Na-acetate, pH 6 buffer plus 200 mM NaCl. Alternatively, the 
protein can be successfully refolded while immobilized on the Ni-NTA column. The 
recommended conditions are as follows: renature using a linear 6M-1M urea gradient in 
500 mM NaCl, 20% glycerol, 20 mM Tris/HCl pH 7.4, containing protease inhibitors. 
The renaturation should be performed over a period of 1.5 hours or more. After 
renaturation the proteins are eluted by the addition of 250 mM imidazole. Imidazole 
is removed by a final dialyzing step against PBS or 50 mM sodium acetate pH 6 buffer 
plus 200 mM NaCl. The purified protein is stored at 4°C or frozen at -80°C. 

In addition to the above expression vector, the present invention further includes 
an expression vector comprising phage operator and promoter elements operatively 
linked to a polynucleotide of the present invention, called pHE4a. (ATCC Accession 
Number 209645, deposited on February 25, 1998.) This vector contains: 1) a 
neomycin phosphotransferase gene as a selection marker, 2) an E. coli origin of 
replication, 3) a T5 phage promoter sequence, 4) two lac operator sequences, 5) a 
Shine-Dalgarno sequence, and 6) the lactose operon repressor gene (lacIq). The origin 
of replication (oriC) is derived from pUC19 (LTI, Gaithersburg, MD). The promoter 
sequence and operator sequences are made synthetically. 

DNA can be inserted into the pHE4a by restricting the vector with NdeI and 
XbaI, BamHI, XhoI, or Asp718, running the restricted product on a gel, and isolating 
the larger fragment (the stuffer fragment should be about 310 base pairs). The DNA 
insert is generated according to the PCR protocol described in Example 1, using PCR 
primers having restriction sites for NdeI (5' primer) and XbaI, BamHI, XhoI, or 
Asp718 (3' primer). The PCR insert is gel purified and restricted with compatible 
enzymes. The insert and vector are ligated according to standard protocols. 

The engineered vector could easily be substituted in the above protocol to 
express protein in a bacterial system. 

Example 6: Purification of a Polypeptide from an Inclusion Body 

The following alternative method can be used to purify a polypeptide expressed 
in E. coli when it is present in the form of inclusion bodies. Unless otherwise specified, 
all of the following steps are conducted at 4-10°C. 

Upon completion of the production phase of the E. coli fermentation, the cell 
culture is cooled to 4-10°C and the cells harvested by continuous centrifugation at 
15,000 rpm (Heraeus Sepatech). On the basis of the expected yield of protein per unit 
weight of cell paste and the amount of purified protein required, an appropriate amount 
of cell paste, by weight, is suspended in a buffer solution containing 100 mM Tris, 50 
mM EDTA, pH 7.4. The cells are dispersed to a homogeneous suspension using a 
high shear mixer. 

The cells are then lysed by passing the solution through a microfluidizer 
(Microfluidics, Corp. or APV Gaulin, Inc.) twice at 4000-6000 psi. The homogenate is 
then mixed with NaCl solution to a final concentration of 0.5 M NaCl, followed by 
centrifugation at 7000 x g for 15 min. The resultant pellet is washed again using 0.5 M 
NaCl, 100 mM Tris, 50 mM EDTA, pH 7.4. 

The resulting washed inclusion bodies are solubilized with 1.5 M guanidine 
hydrochloride (GuHCl) for 2-4 hours. After 7000 x g centrifugation for 15 min., the 
pellet is discarded and the polypeptide-containing supernatant is incubated at 4°C 
overnight to allow further GuHCl extraction. 

Following high speed centrifugation (30,000 x g) to remove insoluble particles, 
the GuHCl solubilized protein is refolded by quickly mixing the GuHCl extract with 20 
volumes of buffer containing 50 mM sodium, pH 4.5, 150 mM NaCl, 2 mM EDTA by 
vigorous stirring. The refolded diluted protein solution is kept at 4°C without mixing 
for 12 hours prior to further purification steps. 
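
The effect of the rapid dilution on the denaturant can be checked with a one-line calculation, sketched here in Python. Mixing one volume of extract with 20 volumes of buffer is a 21-fold dilution, so the 1.5 M GuHCl extract ends up at roughly 0.07 M. The function name is hypothetical, and the calculation assumes ideal mixing with additive volumes.

```python
def concentration_after_dilution(initial_conc: float,
                                 volumes_of_diluent: float) -> float:
    """Concentration after mixing 1 volume of sample with N volumes of diluent."""
    if volumes_of_diluent < 0:
        raise ValueError("volumes_of_diluent must be non-negative")
    return initial_conc / (1.0 + volumes_of_diluent)


if __name__ == "__main__":
    # 1.5 M GuHCl extract mixed with 20 volumes of refolding buffer.
    print(round(concentration_after_dilution(1.5, 20), 4), "M GuHCl")
```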

To clarify the refolded polypeptide solution, a previously prepared tangential 
filtration unit equipped with 0.16 µm membrane filter with appropriate surface area 
(e.g., Filtron), equilibrated with 40 mM sodium acetate, pH 6.0 is employed. The 
filtered sample is loaded onto a cation exchange resin (e.g., Poros HS-50, Perseptive 
Biosystems). The column is washed with 40 mM sodium acetate, pH 6.0 and eluted 
with 250 mM, 500 mM, 1000 mM, and 1500 mM NaCl in the same buffer, in a 
stepwise manner. The absorbance at 280 nm of the effluent is continuously monitored. 
Fractions are collected and further analyzed by SDS-PAGE. 

Fractions containing the polypeptide are then pooled and mixed with 4 volumes 
of water. The diluted sample is then loaded onto a previously prepared set of tandem 
columns of strong anion (Poros HQ-50, Perseptive Biosystems) and weak cation 
(Poros CM-20, Perseptive Biosystems) exchange resins. The columns are equilibrated 
with 40 mM sodium acetate, pH 6.0. Both columns are washed with 40 mM sodium 
acetate, pH 6.0, 200 mM NaCl. The CM-20 column is then eluted using a 10 column 
volume linear gradient ranging from 0.2 M NaCl, 50 mM sodium acetate, pH 6.0 to 1.0 
M NaCl, 50 mM sodium acetate, pH 6.5. Fractions are collected under constant A280 
monitoring of the effluent. Fractions containing the polypeptide (determined, for 
instance, by 16% SDS-PAGE) are then pooled. 
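
For orientation, the nominal NaCl concentration at any point in the 10 column volume linear gradient can be computed by linear interpolation, as in the Python sketch below. This is an idealized calculation that ignores system dead volume and mixing delay; the function name is hypothetical.

```python
def linear_gradient(start: float, end: float, total_cv: float, cv: float) -> float:
    """Nominal concentration after `cv` column volumes of a linear gradient."""
    if total_cv <= 0 or not 0 <= cv <= total_cv:
        raise ValueError("cv must lie between 0 and total_cv, with total_cv > 0")
    return start + (end - start) * cv / total_cv


if __name__ == "__main__":
    # 0.2 M to 1.0 M NaCl over 10 column volumes, reported every 2 CV.
    for cv in range(0, 11, 2):
        print(cv, "CV:", round(linear_gradient(0.2, 1.0, 10, cv), 2), "M NaCl")
```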

The resultant polypeptide should exhibit greater than 95% purity after the above 
refolding and purification steps. No major contaminant bands should be observed from 
Coomassie blue stained 16% SDS-PAGE gel when 5 µg of purified protein is loaded. 
The purified protein can also be tested for endotoxin/LPS contamination, and typically 
the LPS content is less than 0.1 ng/ml according to LAL assays. 



Example 7: Cloning and Expression of a Polypeptide in a Baculovirus 
Expression System 

In this example, the plasmid shuttle vector pA2 is used to insert a polynucleotide 
into a baculovirus to express a polypeptide. This expression vector contains the strong 
polyhedrin promoter of the Autographa californica nuclear polyhedrosis virus 
(AcMNPV) followed by convenient restriction sites such as BamHI, XbaI and 
Asp718. The polyadenylation site of the simian virus 40 ("SV40") is used for efficient 
polyadenylation. For easy selection of recombinant virus, the plasmid contains the 
beta-galactosidase gene from E. coli under control of a weak Drosophila promoter in the 
same orientation, followed by the polyadenylation signal of the polyhedrin gene. The 
inserted genes are flanked on both sides by viral sequences for cell-mediated 
homologous recombination with wild-type viral DNA to generate a viable virus that 
expresses the cloned polynucleotide. 

Many other baculovirus vectors can be used in place of the vector above, such 
as pAc373, pVL941, and pAcIM1, as one skilled in the art would readily appreciate, as 
long as the construct provides appropriately located signals for transcription, 
translation, secretion and the like, including a signal peptide and an in-frame AUG as 
required. Such vectors are described, for instance, in Luckow et al., Virology 170:31- 
39 (1989). 

Specifically, the cDNA sequence contained in the deposited clone, including the 
AUG initiation codon and the naturally associated leader sequence identified in Table 1, 
is amplified using the PCR protocol described in Example 1. If the naturally occurring 
signal sequence is used to produce the secreted protein, the pA2 vector does not need a 
second signal peptide. Alternatively, the vector can be modified (pA2 GP) to include a 
baculovirus leader sequence, using the standard methods described in Summers et al., 
"A Manual of Methods for Baculovirus Vectors and Insect Cell Culture Procedures," 
Texas Agricultural Experimental Station Bulletin No. 1555 (1987). 

The amplified fragment is isolated from a 1% agarose gel using a commercially 
available kit ("Geneclean," BIO 101 Inc., La Jolla, Ca.). The fragment then is digested 
with appropriate restriction enzymes and again purified on a 1% agarose gel. 

The plasmid is digested with the corresponding restriction enzymes and, 
optionally, can be dephosphorylated using calf intestinal phosphatase, using routine 
procedures known in the art. The DNA is then isolated from a 1% agarose gel using a 
commercially available kit ("Geneclean," BIO 101 Inc., La Jolla, Ca.). 

The fragment and the dephosphorylated plasmid are ligated together with T4 
DNA ligase. E. coli HB101 or other suitable E. coli hosts such as XL-1 Blue 
(Stratagene Cloning Systems, La Jolla, CA) cells are transformed with the ligation 
mixture and spread on culture plates. Bacteria containing the plasmid are identified by 
digesting DNA from individual colonies and analyzing the digestion product by gel 
electrophoresis. The sequence of the cloned fragment is confirmed by DNA 
sequencing. 

Five µg of a plasmid containing the polynucleotide is co-transfected with 1.0 µg 
of a commercially available linearized baculovirus DNA ("BaculoGold™ baculovirus 
DNA", Pharmingen, San Diego, CA), using the lipofection method described by 
Felgner et al., Proc. Natl. Acad. Sci. USA 84:7413-7417 (1987). One µg of 
BaculoGold™ virus DNA and 5 µg of the plasmid are mixed in a sterile well of a 
microtiter plate containing 50 µl of serum-free Grace's medium (Life Technologies 
Inc., Gaithersburg, MD). Afterwards, 10 µl Lipofectin plus 90 µl Grace's medium are 
added, mixed and incubated for 15 minutes at room temperature. Then the transfection 
mixture is added drop-wise to Sf9 insect cells (ATCC CRL 1711) seeded in a 35 mm 
tissue culture plate with 1 ml Grace's medium without serum. The plate is then 
incubated for 5 hours at 27°C. The transfection solution is then removed from the plate 
and 1 ml of Grace's insect medium supplemented with 10% fetal calf serum is added. 
Cultivation is then continued at 27°C for four days. 

After four days the supernatant is collected and a plaque assay is performed, as 
described by Summers and Smith, supra. An agarose gel with "Blue Gal" (Life 
Technologies Inc., Gaithersburg) is used to allow easy identification and isolation of 
gal-expressing clones, which produce blue-stained plaques. (A detailed description of a 
"plaque assay" of this type can also be found in the user's guide for insect cell culture 
and baculovirology distributed by Life Technologies Inc., Gaithersburg, page 9-10.) 
After appropriate incubation, blue stained plaques are picked with the tip of a 
micropipettor (e.g., Eppendorf). The agar containing the recombinant viruses is then 
resuspended in a microcentrifuge tube containing 200 µl of Grace's medium and the 
suspension containing the recombinant baculovirus is used to infect Sf9 cells seeded in 
35 mm dishes. Four days later the supernatants of these culture dishes are harvested 
and then they are stored at 4°C. 

To verify the expression of the polypeptide, Sf9 cells are grown in Grace's 
medium supplemented with 10% heat-inactivated FBS. The cells are infected with the 
recombinant baculovirus containing the polynucleotide at a multiplicity of infection 
("MOI") of about 2. If radiolabeled proteins are desired, 6 hours later the medium is 
removed and is replaced with SF900 II medium minus methionine and cysteine 
(available from Life Technologies Inc., Rockville, MD). After 42 hours, 5 µCi of 35S- 
methionine and 5 µCi 35S-cysteine (available from Amersham) are added. The cells are 
further incubated for 16 hours and then are harvested by centrifugation. The proteins 
in the supernatant as well as the intracellular proteins are analyzed by SDS-PAGE 
followed by autoradiography (if radiolabeled). 
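
The volume of virus stock needed to reach a given MOI follows directly from the definition of MOI (infectious units per cell). The Python sketch below shows the arithmetic; the cell number and the viral titer used in the example are hypothetical values, since neither is stated in the protocol, and the titer would normally come from the plaque assay.

```python
def virus_volume_ml(moi: float, n_cells: float, titer_pfu_per_ml: float) -> float:
    """Volume of virus stock (ml) that delivers `moi` pfu per cell."""
    if titer_pfu_per_ml <= 0:
        raise ValueError("titer must be positive")
    return moi * n_cells / titer_pfu_per_ml


if __name__ == "__main__":
    # Hypothetical dish of 1e6 Sf9 cells and a stock titer of 1e8 pfu/ml.
    print(virus_volume_ml(2, 1e6, 1e8), "ml of virus stock for an MOI of about 2")
```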

Microsequencing of the amino acid sequence of the amino terminus of purified 
protein may be used to determine the amino terminal sequence of the produced 
protein. 






Example 8: Expression of a Polypeptide in Mammalian Cells 

The polypeptide of the present invention can be expressed in a mammalian cell. 
A typical mammalian expression vector contains a promoter element, which mediates 
the initiation of transcription of mRNA, a protein coding sequence, and signals required 
for the termination of transcription and polyadenylation of the transcript. Additional 
elements include enhancers, Kozak sequences and intervening sequences flanked by 
donor and acceptor sites for RNA splicing. Highly efficient transcription is achieved 
with the early and late promoters from SV40, the long terminal repeats (LTRs) from 
Retroviruses, e.g., RSV, HTLV1, HIVI and the early promoter of the cytomegalovirus 
(CMV). However, cellular elements can also be used (e.g., the human actin promoter). 

Suitable expression vectors for use in practicing the present invention include, 
for example, vectors such as pSVL and pMSG (Pharmacia, Uppsala, Sweden), 
pRSVcat (ATCC 37152), pSV2dhfr (ATCC 37146), pBC12MI (ATCC 67109), 
pCMVSport 2.0, and pCMVSport 3.0. Mammalian host cells that could be used 
include human Hela, 293, H9 and Jurkat cells, mouse NIH3T3 and C127 cells, Cos 1, 
Cos 7 and CV1, quail QC1-3 cells, mouse L cells and Chinese hamster ovary (CHO) 
cells. 

Alternatively, the polypeptide can be expressed in stable cell lines containing the 
polynucleotide integrated into a chromosome. Co-transfection with a selectable 
marker such as dhfr, gpt, neomycin, or hygromycin allows the identification and isolation 
of the transfected cells. 

The transfected gene can also be amplified to express large amounts of the 
encoded protein. The DHFR (dihydrofolate reductase) marker is useful in developing 
cell lines that carry several hundred or even several thousand copies of the gene of 
interest. (See, e.g., Alt, F. W., et al., J. Biol. Chem. 253:1357-1370 (1978); Hamlin, 
J. L. and Ma, C., Biochem. et Biophys. Acta, 1097:107-143 (1990); Page, M. J. and 
Sydenham, M. A., Biotechnology 9:64-68 (1991).) Another useful selection marker is 
the enzyme glutamine synthase (GS) (Murphy et al., Biochem J. 227:277-279 (1991); 
Bebbington et al., Bio/Technology 10:169-175 (1992)). Using these markers, the 
mammalian cells are grown in selective medium and the cells with the highest resistance 
are selected. These cell lines contain the amplified gene(s) integrated into a 
chromosome. Chinese hamster ovary (CHO) and NSO cells are often used for the 
production of proteins. 

Derivatives of the plasmid pSV2-dhfr (ATCC Accession No. 37146), the 
expression vectors pC4 (ATCC Accession No. 209646) and pC6 (ATCC Accession 
No. 209647) contain the strong promoter (LTR) of the Rous Sarcoma Virus (Cullen et 
al., Molecular and Cellular Biology, 438-447 (March, 1985)) plus a fragment of the 
CMV-enhancer (Boshart et al., Cell 41:521-530 (1985)). Multiple cloning sites, e.g., 
with the restriction enzyme cleavage sites BamHI, XbaI and Asp718, facilitate the 
cloning of the gene of interest. The vectors also contain the 3' intron, the 
polyadenylation and termination signal of the rat preproinsulin gene, and the mouse 
DHFR gene under control of the SV40 early promoter. 

Specifically, the plasmid pC6, for example, is digested with appropriate 
restriction enzymes and then dephosphorylated using calf intestinal phosphatase by 
procedures known in the art. The vector is then isolated from a 1% agarose gel. 

A polynucleotide of the present invention is amplified according to the protocol 
outlined in Example 1. If the naturally occurring signal sequence is used to produce the 
secreted protein, the vector does not need a second signal peptide. Alternatively, if the 
naturally occurring signal sequence is not used, the vector can be modified to include a 
heterologous signal sequence. (See, e.g., WO 96/34891.) 

The amplified fragment is isolated from a 1% agarose gel using a commercially 
available kit ("Geneclean," BIO 101 Inc., La Jolla, Ca.). The fragment then is digested 
with appropriate restriction enzymes and again purified on a 1% agarose gel. 

The amplified fragment is then digested with the same restriction enzyme and 
purified on a 1% agarose gel. The isolated fragment and the dephosphorylated vector 
are then ligated with T4 DNA ligase. E. coli HB101 or XL-1 Blue cells are then 
transformed and bacteria are identified that contain the fragment inserted into plasmid 
pC6 using, for instance, restriction enzyme analysis. 

Chinese hamster ovary cells lacking an active DHFR gene are used for 
transfection. Five µg of the expression plasmid pC6 is cotransfected with 0.5 µg of the 
plasmid pSVneo using lipofectin (Felgner et al., supra). The plasmid pSV2-neo 
contains a dominant selectable marker, the neo gene from Tn5 encoding an enzyme that 
confers resistance to a group of antibiotics including G418. The cells are seeded in 
alpha minus MEM supplemented with 1 mg/ml G418. After 2 days, the cells are 
trypsinized and seeded in hybridoma cloning plates (Greiner, Germany) in alpha minus 
MEM supplemented with 10, 25, or 50 ng/ml of methotrexate plus 1 mg/ml G418. 
After about 10-14 days single clones are trypsinized and then seeded in 6-well petri 
dishes or 10 ml flasks using different concentrations of methotrexate (50 nM, 100 nM, 
200 nM, 400 nM, 800 nM). Clones growing at the highest concentrations of 
methotrexate are then transferred to new 6-well plates containing even higher 
concentrations of methotrexate (1 µM, 2 µM, 5 µM, 10 mM, 20 mM). The same 
procedure is repeated until clones are obtained which grow at a concentration of 100 - 
200 µM. Expression of the desired gene product is analyzed, for instance, by SDS- 
PAGE and Western blot or by reversed phase HPLC analysis. 



Example 9: Protein Fusions 

The polypeptides of the present invention are preferably fused to other proteins. 
These fusion proteins can be used for a variety of applications. For example, fusion of 
the present polypeptides to His-tag, HA-tag, protein A, IgG domains, and maltose 
binding protein facilitates purification. (See Example 5; see also EP A 394,827; 
Traunecker, et al., Nature 331:84-86 (1988).) Similarly, fusion to IgG-1, IgG-3, and 
albumin increases the half-life in vivo. Nuclear localization signals fused to the 
polypeptides of the present invention can target the protein to a specific subcellular 
localization, while covalent heterodimers or homodimers can increase or decrease the 
activity of a fusion protein. Fusion proteins can also create chimeric molecules having 
more than one function. Finally, fusion proteins can increase solubility and/or stability 
of the fused protein compared to the non-fused protein. All of the types of fusion 
proteins described above can be made by modifying the following protocol, which 
outlines the fusion of a polypeptide to an IgG molecule, or the protocol described in 
Example 5. 

Briefly, the human Fc portion of the IgG molecule can be PCR amplified, using 
primers that span the 5' and 3' ends of the sequence described below. These primers 
also should have convenient restriction enzyme sites that will facilitate cloning into an 
expression vector, preferably a mammalian expression vector. 

For example, if pC4 (Accession No. 209646) is used, the human Fc portion can 
be ligated into the BamHI cloning site. Note that the 3' BamHI site should be 
destroyed. Next, the vector containing the human Fc portion is re-restricted with 
BamHI, linearizing the vector, and a polynucleotide of the present invention, isolated 
by the PCR protocol described in Example 1, is ligated into this BamHI site. Note that 
the polynucleotide is cloned without a stop codon, otherwise a fusion protein will not 
be produced. 
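
The requirement that the insert carry no stop codon can be checked before cloning with a short script. The following Python sketch is illustrative only: it assumes the supplied sequence starts at the first base of the open reading frame and contains only the coding region to be fused, and it uses the standard stop codons TAA, TAG and TGA. The function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fusion_ready(orf: str) -> bool:
    """True if the insert is a whole number of codons with no in-frame stop.

    Assumes `orf` begins at the first base of the reading frame, so that the
    downstream Fc sequence stays in frame after ligation.
    """
    orf = orf.upper()
    if len(orf) % 3 != 0:
        return False  # a partial codon would shift the Fc reading frame
    codons = (orf[i:i + 3] for i in range(0, len(orf), 3))
    return not any(codon in STOP_CODONS for codon in codons)


if __name__ == "__main__":
    print(fusion_ready("ATGGCCAAA"))  # hypothetical insert without a stop codon
    print(fusion_ready("ATGGCCTGA"))  # hypothetical insert ending in TGA
```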

If the naturally occurring signal sequence is used to produce the secreted 
protein, pC4 does not need a second signal peptide. Alternatively, if the naturally 
occurring signal sequence is not used, the vector can be modified to include a 
heterologous signal sequence. (See, e.g., WO 96/34891.) 

Human IgG Fc region: 

GGGATCCGGAGCCCAAATCTTCTGACAAAACTCACACATGCCCACCGTGCC 
CAGCACCTGAATTCGAGGGTGCACCGTCAGTCTTCCTCTTCCCCCCAAAACC 
CAAGGACACCCTCATGATCTCCCGGACTCCTGAGGTCACATGCGTGGTGGT 
GGACGTAAGCCACGAAGACCCTGAGGTCAAGTTCAACTGGTACGTGGACG 
GCGTGGAGGTGCATAATGCCAAGACAAAGCCGCGGGAGGAGCAGTACAAC 
AGCACGTACCGTGTGGTCAGCGTCCTCACCGTCCTGCACCAGGACTGGCTG 
AATGGCAAGGAGTACAAGTGCAAGGTCTCCAACAAAGCCCTCCCAACCCCC 
ATCGAGAAAACCATCTCCAAAGCCAAAGGGCAGCCCCGAGAACCACAGGT 
GTACACCCTGCCCCCATCCCGGGATGAGCTGACCAAGAACCAGGTCAGCCT 
GACCTGCCTGGTCAAAGGCTTCTATCCAAGCGACATCGCCGTGGAGTGGGA 
GAGCAATGGGCAGCCGGAGAACAACTACAAGACCACGCCTCCCGTGCTGG 
ACTCCGACGGCTCCTTCTTCCTCTACAGCAAGCTCACCGTGGACAAGAGCA 
GGTGGCAGCAGGGGAACGTCTTCTCATGCTCCGTGATGCATGAGGCTCTGC 
ACAACCACTACACGCAGAAGAGCCTCTCCCTGTCTCCGGGTAAATGAGTGC 
GACGGCCGCGACTCTAGAGGAT (SEQ ID NO:1) 

Example 10: Production of an Antibody from a Polypeptide 

The antibodies of the present invention can be prepared by a variety of methods. 
(See, Current Protocols, Chapter 2.) For example, cells expressing a polypeptide of 
the present invention are administered to an animal to induce the production of sera 
containing polyclonal antibodies. In a preferred method, a preparation of the secreted 
protein is prepared and purified to render it substantially free of natural contaminants. 
Such a preparation is then introduced into an animal in order to produce polyclonal 
antisera of greater specific activity. 

In the most preferred method, the antibodies of the present invention are
monoclonal antibodies (or protein binding fragments thereof). Such monoclonal
antibodies can be prepared using hybridoma technology. (Kohler et al., Nature
256:495 (1975); Kohler et al., Eur. J. Immunol. 6:511 (1976); Kohler et al., Eur. J.
Immunol. 6:292 (1976); Hammerling et al., in: Monoclonal Antibodies and T-Cell
Hybridomas, Elsevier, N.Y., pp. 563-681 (1981).) In general, such procedures
involve immunizing an animal (preferably a mouse) with polypeptide or, more
preferably, with a secreted polypeptide-expressing cell. Such cells may be cultured in
any suitable tissue culture medium; however, it is preferable to culture cells in Earle's
modified Eagle's medium supplemented with 10% fetal bovine serum (inactivated at
about 56°C), and supplemented with about 10 g/l of nonessential amino acids, about
1,000 U/ml of penicillin, and about 100 ug/ml of streptomycin.

The splenocytes of such mice are extracted and fused with a suitable myeloma
cell line. Any suitable myeloma cell line may be employed in accordance with the
present invention; however, it is preferable to employ the parent myeloma cell line
(SP20), available from the ATCC. After fusion, the resulting hybridoma cells are
selectively maintained in HAT medium, and then cloned by limiting dilution as
described by Wands et al. (Gastroenterology 80:225-232 (1981).) The hybridoma cells
obtained through such a selection are then assayed to identify clones which secrete
antibodies capable of binding the polypeptide.

Alternatively, additional antibodies capable of binding to the polypeptide can be
produced in a two-step procedure using anti-idiotypic antibodies. Such a method
makes use of the fact that antibodies are themselves antigens, and therefore, it is
possible to obtain an antibody which binds to a second antibody. In accordance with
this method, protein specific antibodies are used to immunize an animal, preferably a
mouse. The splenocytes of such an animal are then used to produce hybridoma cells,
and the hybridoma cells are screened to identify clones which produce an antibody
whose ability to bind to the protein-specific antibody can be blocked by the polypeptide.
Such antibodies comprise anti-idiotypic antibodies to the protein-specific antibody and
can be used to immunize an animal to induce formation of further protein-specific
antibodies.

It will be appreciated that Fab and F(ab')2 and other fragments of the antibodies
of the present invention may be used according to the methods disclosed herein. Such
fragments are typically produced by proteolytic cleavage, using enzymes such as papain
(to produce Fab fragments) or pepsin (to produce F(ab')2 fragments). Alternatively,
secreted protein-binding fragments can be produced through the application of
recombinant DNA technology or through synthetic chemistry.

For in vivo use of antibodies in humans, it may be preferable to use
"humanized" chimeric monoclonal antibodies. Such antibodies can be produced using
genetic constructs derived from hybridoma cells producing the monoclonal antibodies
described above. Methods for producing chimeric antibodies are known in the art.
(See, for review, Morrison, Science 229:1202 (1985); Oi et al., BioTechniques 4:214
(1986); Cabilly et al., U.S. Patent No. 4,816,567; Taniguchi et al., EP 171496;
Morrison et al., EP 173494; Neuberger et al., WO 8601533; Robinson et al., WO
8702671; Boulianne et al., Nature 312:643 (1984); Neuberger et al., Nature 314:268
(1985).)

Example 11: Production Of Secreted Protein For High-Throughput
Screening Assays

The following protocol produces a supernatant containing a polypeptide to be
tested. This supernatant can then be used in the Screening Assays described in
Examples 13-20.

First, dilute Poly-D-Lysine (644 587 Boehringer-Mannheim) stock solution
(1 mg/ml in PBS) 1:20 in PBS (w/o calcium or magnesium, 17-516F Biowhittaker) for a
working solution of 50 ug/ml. Add 200 ul of this solution to each well (24 well plates)
and incubate at RT for 20 minutes. Be sure to distribute the solution over each well
(note: a 12-channel pipetter may be used with tips on every other channel). Aspirate off
the Poly-D-Lysine solution and rinse with 1 ml PBS (Phosphate Buffered Saline). The
PBS should remain in the well until just prior to plating the cells. Plates may be
poly-lysine coated in advance for up to two weeks.
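
The 1:20 dilution arithmetic above can be sketched as follows. This is an illustration only: the function names are hypothetical, and the 24-well plate total (24 x 200 ul) is derived from the volumes stated in the text rather than given there.

```python
def working_concentration(stock_ug_per_ml: float, dilution_factor: float) -> float:
    """Concentration after a 1:dilution_factor dilution."""
    return stock_ug_per_ml / dilution_factor


def dilution_volumes(final_volume_ul: float, dilution_factor: float):
    """Return (stock volume, diluent volume) in ul for a given final volume."""
    stock_ul = final_volume_ul / dilution_factor
    return stock_ul, final_volume_ul - stock_ul


# 1 mg/ml stock (1000 ug/ml) diluted 1:20 gives the 50 ug/ml working solution.
working = working_concentration(1000.0, 20.0)

# Coating one 24-well plate at 200 ul per well needs 4800 ul of working solution.
stock_ul, pbs_ul = dilution_volumes(24 * 200.0, 20.0)
```

In practice a small excess would be prepared to allow for pipetting losses.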

Plate 293T cells (do not carry cells past P+20) at 2 x 10^5 cells/well in 0.5 ml
DMEM (Dulbecco's Modified Eagle Medium) (with 4.5 G/L glucose and L-glutamine
(12-604F Biowhittaker))/10% heat inactivated FBS (14-503F Biowhittaker)/1x
Penstrep (17-602E Biowhittaker). Let the cells grow overnight.

The next day, mix together in a sterile solution basin: 300 ul Lipofectamine
(18324-012 Gibco/BRL) and 5 ml Optimem I (31985070 Gibco/BRL)/96-well plate.
With a small volume multi-channel pipetter, aliquot approximately 2 ug of an expression
vector containing a polynucleotide insert, produced by the methods described in
Examples 8 or 9, into an appropriately labeled 96-well round bottom plate. With a
multi-channel pipetter, add 50 ul of the Lipofectamine/Optimem I mixture to each well.
Pipette up and down gently to mix. Incubate at RT 15-45 minutes. After about 20
minutes, use a multi-channel pipetter to add 150 ul Optimem I to each well. As a
control, one plate of vector DNA lacking an insert should be transfected with each set of
transfections.
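
As a quick consistency check on the volumes above, the sketch below tallies the Lipofectamine/Optimem I master mix against what a 96-well plate consumes. It assumes the DNA volume is negligible, which the text does not state, and the constant names are illustrative.

```python
# Volumes taken from the protocol text, in microlitres.
LIPOFECTAMINE_UL = 300
OPTIMEM_UL = 5000
WELLS_PER_PLATE = 96
MIX_PER_WELL_UL = 50
OPTIMEM_TOPUP_UL = 150


def master_mix_excess_ul() -> int:
    """Master mix left over after dispensing 50 ul into each of 96 wells."""
    prepared = LIPOFECTAMINE_UL + OPTIMEM_UL
    dispensed = WELLS_PER_PLATE * MIX_PER_WELL_UL
    return prepared - dispensed


def complex_volume_per_well_ul() -> int:
    """Per-well volume after the Optimem I top-up (DNA volume ignored)."""
    return MIX_PER_WELL_UL + OPTIMEM_TOPUP_UL
```

Under these assumptions about 500 ul of master mix remains as dead volume, and each well holds roughly 200 ul of complex for transfer to the cells.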

Preferably, the transfection should be performed by tag-teaming the following
tasks. By tag-teaming, hands-on time is cut in half, and the cells do not spend too
much time on PBS. First, person A aspirates off the media from four 24-well plates of
cells, and then person B rinses each well with 0.5-1 ml PBS. Person A then aspirates off
the PBS rinse, and person B, using a 12-channel pipetter with tips on every other channel,
adds the 200 ul of DNA/Lipofectamine/Optimem I complex to the odd wells first, then to
the even wells, to each row on the 24-well plates. Incubate at 37°C for 6 hours.

While cells are incubating, prepare appropriate media, either 1% BSA in DMEM
with 1x penstrep, or CHO-5 media (116.6 mg/L of CaCl2 (anhyd); 0.00130 mg/L
CuSO4-5H2O; 0.050 mg/L of Fe(NO3)3-9H2O; 0.417 mg/L of FeSO4-7H2O; 311.80
mg/L of KCl; 28.64 mg/L of MgCl2; 48.84 mg/L of MgSO4; 6995.50 mg/L of NaCl;
2400.0 mg/L of NaHCO3; 62.50 mg/L of NaH2PO4-H2O; 71.02 mg/L of Na2HPO4;
.4320 mg/L of ZnSO4-7H2O; .002 mg/L of Arachidonic Acid; 1.022 mg/L of
Cholesterol; .070 mg/L of DL-alpha-Tocopherol-Acetate; 0.0520 mg/L of Linoleic
Acid; 0.010 mg/L of Linolenic Acid; 0.010 mg/L of Myristic Acid; 0.010 mg/L of Oleic
Acid; 0.010 mg/L of Palmitric Acid; 0.010 mg/L of Palmitic Acid; 100 mg/L of
Pluronic F-68; 0.010 mg/L of Stearic Acid; 2.20 mg/L of Tween 80; 4551 mg/L of
D-Glucose; 130.85 mg/ml of L-Alanine; 147.50 mg/ml of L-Arginine-HCL; 7.50 mg/ml
of L-Asparagine-H2O; 6.65 mg/ml of L-Aspartic Acid; 29.56 mg/ml of
L-Cystine-2HCL-H2O; 31.29 mg/ml of L-Cystine-2HCL; 7.35 mg/ml of L-Glutamic Acid;
365.0 mg/ml of L-Glutamine; 18.75 mg/ml of Glycine; 52.48 mg/ml of
L-Histidine-HCL-H2O; 106.97 mg/ml of L-Isoleucine; 111.45 mg/ml of L-Leucine;
163.75 mg/ml of L-Lysine HCL; 32.34 mg/ml of L-Methionine; 68.48 mg/ml of
L-Phenylalanine; 40.0 mg/ml of L-Proline; 26.25 mg/ml of L-Serine; 101.05 mg/ml of
L-Threonine; 19.22 mg/ml of L-Tryptophan; 91.79 mg/ml of L-Tyrosine-2Na-2H2O;
99.65 mg/ml of L-Valine; 0.0035 mg/L of Biotin; 3.24 mg/L of D-Ca Pantothenate;
11.78 mg/L of Choline Chloride; 4.65 mg/L of Folic Acid; 15.60 mg/L of i-Inositol;
3.02 mg/L of Niacinamide; 3.00 mg/L of Pyridoxal HCL; 0.031 mg/L of Pyridoxine HCL;
0.319 mg/L of Riboflavin; 3.17 mg/L of Thiamine HCL; 0.365 mg/L of Thymidine; and
0.680 mg/L of Vitamin B12; 25 mM of HEPES Buffer; 2.39 mg/L of Na Hypoxanthine;
0.105 mg/L of Lipoic Acid; 0.081 mg/L of Sodium Putrescine-2HCL; 55.0 mg/L of
Sodium Pyruvate; 0.0067 mg/L of Sodium Selenite; 20 uM of Ethanolamine; 0.122
mg/L of Ferric Citrate; 41.70 mg/L of Methyl-B-Cyclodextrin complexed with Linoleic
Acid; 33.33 mg/L of Methyl-B-Cyclodextrin complexed with Oleic Acid; and 10 mg/L
of Methyl-B-Cyclodextrin complexed with Retinal) with 2 mM glutamine and 1x
penstrep. (BSA (81-068-3 Bayer) 100 g dissolved in 1 L DMEM for a 10% BSA stock
solution). Filter the media and collect 50 ul for endotoxin assay in a 15 ml polystyrene
conical.

The transfection reaction is terminated, preferably by tag-teaming, at the end of
the incubation period. Person A aspirates off the transfection media, while person B
adds 1.5 ml appropriate media to each well. Incubate at 37°C for 45 or 72 hours
depending on the media used: 1% BSA for 45 hours or CHO-5 for 72 hours.

On day four, using a 300ul multichannel pipetter, aliquot 600ul in one 1ml deep 
well plate and the remaining supernatant into a 2ml deep well. The supernatants from 
each well can then be used in the assays described in Examples 13-20. 

It is specifically understood that when activity is obtained in any of the assays
described below using a supernatant, the activity originates either from the polypeptide
directly (e.g., as a secreted protein) or from the polypeptide inducing expression of other
proteins, which are then secreted into the supernatant. Thus, the invention further
provides a method of identifying the protein in the supernatant characterized by an
activity in a particular assay.

Example 12: Construction of GAS Reporter Construct

One signal transduction pathway involved in the differentiation and proliferation
of cells is called the Jaks-STATs pathway. Activated proteins in the Jaks-STATs
pathway bind to gamma activation site ("GAS") elements or interferon-sensitive
responsive elements ("ISRE"), located in the promoter of many genes. The binding of a
protein to these elements alters the expression of the associated gene.

GAS and ISRE elements are recognized by a class of transcription factors called
Signal Transducers and Activators of Transcription, or "STATs." There are six
members of the STATs family. Stat1 and Stat3 are present in many cell types, as is
Stat2 (as response to IFN-alpha is widespread). Stat4 is more restricted and is not in
many cell types, though it has been found in T helper class I cells after treatment with
IL-12. Stat5 was originally called mammary growth factor, but has been found at
higher concentrations in other cells including myeloid cells. It can be activated in tissue
culture cells by many cytokines.

The STATs are activated to translocate from the cytoplasm to the nucleus upon
tyrosine phosphorylation by a set of kinases known as the Janus Kinase ("Jaks")
family. Jaks represent a distinct family of soluble tyrosine kinases and include Tyk2,
Jak1, Jak2, and Jak3. These kinases display significant sequence similarity and are
generally catalytically inactive in resting cells.

The Jaks are activated by a wide range of receptors summarized in the Table
below. (Adapted from review by Schindler and Darnell, Ann. Rev. Biochem. 64:621-51
(1995).) A cytokine receptor family, capable of activating Jaks, is divided into two
groups: (a) Class 1 includes receptors for IL-2, IL-3, IL-4, IL-6, IL-7, IL-9, IL-11,
IL-12, IL-15, Epo, PRL, GH, G-CSF, GM-CSF, LIF, CNTF, and thrombopoietin; and
(b) Class 2 includes IFN-a, IFN-g, and IL-10. The Class 1 receptors share a
conserved cysteine motif (a set of four conserved cysteines and one tryptophan) and a
WSXWS motif (a membrane proximal region encoding Trp-Ser-Xxx-Trp-Ser (SEQ ID
NO:2)).

Thus, on binding of a ligand to a receptor, Jaks are activated, which in turn
activate STATs, which then translocate and bind to GAS elements. This entire process
is encompassed in the Jaks-STATs signal transduction pathway.

Therefore, activation of the Jaks-STATs pathway, reflected by the binding of
the GAS or the ISRE element, can be used to indicate proteins involved in the
proliferation and differentiation of cells. For example, growth factors and cytokines are
known to activate the Jaks-STATs pathway. (See Table below.) Thus, by using GAS
elements linked to reporter molecules, activators of the Jaks-STATs pathway can be
identified.

To construct a synthetic GAS containing promoter element, which is used in the
Biological Assays described in Examples 13-14, a PCR based strategy is employed to
generate a GAS-SV40 promoter sequence. The 5' primer contains four tandem copies
of the GAS binding site found in the IRF1 promoter and previously demonstrated to
bind STATs upon induction with a range of cytokines (Rothman et al., Immunity
1:457-468 (1994)), although other GAS or ISRE elements can be used instead. The 5'
primer also contains 18 bp of sequence complementary to the SV40 early promoter
sequence and is flanked with an XhoI site. The sequence of the 5' primer is:
5':GCGCCTCGAGATTTCCCCGAAATCTAGATTTCCCCGAAATGATTTCCCCG
AAATGATTTCCCCGAAATATCTGCCATCTCAATTAG:3' (SEQ ID NO:3)

The downstream primer is complementary to the SV40 promoter and is flanked
with a Hind III site: 5':GCGGCAAGCTTTTTGCAAAGCCTAGGC:3' (SEQ ID
NO:4)
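
The layout of the 5' primer described above (XhoI site, four GAS copies, 18 bp SV40 tail) can be verified with a short script. This is a sketch only: the 13-base repeat `ATTTCCCCGAAAT` is read off the printed primer rather than named in the text, and `describe_primer` is a hypothetical helper.

```python
XHOI_SITE = "CTCGAG"
# Repeat unit as it appears in the printed primer (an assumption about
# where the GAS site boundaries lie).
GAS_REPEAT = "ATTTCCCCGAAAT"
SV40_TAIL = "ATCTGCCATCTCAATTAG"

# SEQ ID NO:3 as printed, with OCR spacing removed.
FIVE_PRIME_PRIMER = (
    "GCGCCTCGAGATTTCCCCGAAATCTAGATTTCCCCGAAATGATTTCCCCG"
    "AAATGATTTCCCCGAAATATCTGCCATCTCAATTAG"
)


def describe_primer(primer: str) -> dict:
    """Count the structural features of the GAS-SV40 upstream primer."""
    return {
        "xhoi_sites": primer.count(XHOI_SITE),
        "gas_copies": primer.count(GAS_REPEAT),
        "sv40_tail_bp": len(SV40_TAIL) if primer.endswith(SV40_TAIL) else 0,
    }


features = describe_primer(FIVE_PRIME_PRIMER)
```

A result with one XhoI site, four repeats, and an 18 bp tail agrees with the description in the text.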

PCR amplification is performed using the SV40 promoter template present in
the B-gal:promoter plasmid obtained from Clontech. The resulting PCR fragment is
digested with XhoI/Hind III and subcloned into BLSK2-. (Stratagene.) Sequencing
with forward and reverse primers confirms that the insert contains the following
sequence:

5':CTCGAGATTTCCCCGAAATCTAGATTTCCCCG
ATTTCCCCGAAATATCTGCCATCTCAATTAGTCAGCAACCATAGTCCCGCCC
CTAACTCCGCCCATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGC
CCCATGGCTGACTAATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCGGC
CTCTGAGCTATTCCAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTTT
TGCAAAAAGCTT:3' (SEQ ID NO:5)

With this GAS promoter element linked to the SV40 promoter, a GAS:SEAP2
reporter construct is next engineered. Here, the reporter molecule is a secreted alkaline
phosphatase, or "SEAP." Clearly, however, any reporter molecule can be used instead of
SEAP, in this or in any of the other Examples. Well known reporter molecules that can
be used instead of SEAP include chloramphenicol acetyltransferase (CAT), luciferase,
alkaline phosphatase, B-galactosidase, green fluorescent protein (GFP), or any protein
detectable by an antibody.

The above sequence confirmed synthetic GAS-SV40 promoter element is
subcloned into the pSEAP-Promoter vector obtained from Clontech using HindIII and
XhoI, effectively replacing the SV40 promoter with the amplified GAS:SV40 promoter
element, to create the GAS-SEAP vector. However, this vector does not contain a
neomycin resistance gene, and therefore, is not preferred for mammalian expression
systems.

Thus, in order to generate mammalian stable cell lines expressing the GAS-SEAP
reporter, the GAS-SEAP cassette is removed from the GAS-SEAP vector using
SalI and NotI, and inserted into a backbone vector containing the neomycin resistance
gene, such as pGFP-1 (Clontech), using these restriction sites in the multiple cloning
site, to create the GAS-SEAP/Neo vector. Once this vector is transfected into
mammalian cells, this vector can then be used as a reporter molecule for GAS binding
as described in Examples 13-14.

Other constructs can be made using the above description and replacing GAS
with a different promoter sequence. For example, construction of reporter molecules
containing NF-KB and EGR promoter sequences is described in Examples 15 and 16.
However, many other promoters can be substituted using the protocols described in
these Examples. For instance, SRE, IL-2, NFAT, or Osteocalcin promoters can be
substituted, alone or in combination (e.g., GAS/NF-KB/EGR, GAS/NF-KB,
IL-2/NFAT, or NF-KB/GAS). Similarly, other cell lines can be used to test reporter
construct activity, such as HELA (epithelial), HUVEC (endothelial), Reh (B-cell),
Saos-2 (osteoblast), HUVAC (aortic), or Cardiomyocyte.

Example 13: High-Throughput Screening Assay for T-cell Activity. 

The following protocol is used to assess T-cell activity by identifying factors,
such as growth factors and cytokines, that may proliferate or differentiate T-cells.
T-cell activity is assessed using the GAS/SEAP/Neo construct produced in Example 12.
Thus, factors that increase SEAP activity indicate the ability to activate the Jaks-STATs
signal transduction pathway. The T-cell used in this assay is Jurkat T-cells (ATCC
Accession No. TIB-152), although Molt-3 cells (ATCC Accession No. CRL-1552) and
Molt-4 cells (ATCC Accession No. CRL-1582) can also be used.

Jurkat T-cells are lymphoblastic CD4+ Th1 helper cells. In order to generate
stable cell lines, approximately 2 million Jurkat cells are transfected with the
GAS-SEAP/neo vector using DMRIE-C (Life Technologies) (transfection procedure
described below). The transfected cells are seeded to a density of approximately
20,000 cells per well and transfectants resistant to 1 mg/ml Geneticin are selected.
Resistant colonies are expanded and then tested for their response to increasing
concentrations of interferon gamma. The dose response of a selected clone is
demonstrated.

Specifically, the following protocol will yield sufficient cells for 75 wells
containing 200 ul of cells. Thus, it is either scaled up, or performed in multiple to
generate sufficient cells for multiple 96 well plates. Jurkat cells are maintained in RPMI
+ 10% serum with 1% Pen-Strep. Combine 2.5 ml of OPTI-MEM (Life Technologies)
with 10 ug of plasmid DNA in a T25 flask. Add 2.5 ml OPTI-MEM containing 50 ul
of DMRIE-C and incubate at room temperature for 15-45 mins.

During the incubation period, count cell concentration, spin down the required
number of cells (10^7 per transfection), and resuspend in OPTI-MEM to a final
concentration of 10^7 cells/ml. Then add 1 ml of 1 x 10^7 cells in OPTI-MEM to the
T25 flask and incubate at 37°C for 6 hrs. After the incubation, add 10 ml of RPMI +
15% serum.

The Jurkat:GAS-SEAP stable reporter lines are maintained in RPMI + 10%
serum, 1 mg/ml Geneticin, and 1% Pen-Strep. These cells are treated with supernatants
containing a polypeptide as produced by the protocol described in Example 11.

On the day of treatment with the supernatant, the cells should be washed and
resuspended in fresh RPMI + 10% serum to a density of 500,000 cells per ml. The
exact number of cells required will depend on the number of supernatants being
screened. For one 96 well plate, approximately 10 million cells (for 10 plates, 100
million cells) are required.

Transfer the cells to a triangular reservoir boat, in order to dispense the cells into
a 96 well dish, using a 12 channel pipette. Using a 12 channel pipette, transfer 200 ul
of cells into each well (therefore adding 100,000 cells per well).
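
The seeding numbers above (500,000 cells per ml, 200 ul per well, about 10 million cells per plate) can be reproduced with the following sketch. The function names are illustrative and the calculation ignores dead volume in the reservoir.

```python
def cells_per_well(density_per_ml: int, volume_ul: int) -> int:
    """Cells delivered to one well at a given density and dispense volume."""
    return density_per_ml * volume_ul // 1000


def cells_for_plates(plates: int, density_per_ml: int = 500_000,
                     volume_ul: int = 200, wells_per_plate: int = 96) -> int:
    """Total cells needed to seed every well of the given number of plates."""
    return plates * wells_per_plate * cells_per_well(density_per_ml, volume_ul)
```

For one plate this gives 9.6 million cells, consistent with the approximately 10 million quoted in the text.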

After all the plates have been seeded, 50 ul of the supernatants are transferred
directly from the 96 well plate containing the supernatants into each well using a 12
channel pipette. In addition, a dose of exogenous interferon gamma (0.1, 1.0, 10 ng)
is added to wells H9, H10, and H11 to serve as additional positive controls for the
assay.

The 96 well dishes containing Jurkat cells treated with supernatants are placed in
an incubator for 48 hrs (note: this time is variable between 48-72 hrs). 35 ul samples
from each well are then transferred to an opaque 96 well plate using a 12 channel
pipette. The opaque plates should be covered (using cellophane covers) and stored at
-20°C until SEAP assays are performed according to Example 17. The plates
containing the remaining treated cells are placed at 4°C and serve as a source of material
for repeating the assay on a specific well if desired.

As a positive control, 100 Unit/ml interferon gamma, which is known to
activate Jurkat T cells, can be used. Over 30 fold induction is typically observed in the
positive control wells.

Example 14: High-Throughput Screening Assay Identifying Myeloid
Activity

The following protocol is used to assess myeloid activity by identifying factors,
such as growth factors and cytokines, that may proliferate or differentiate myeloid cells.
Myeloid cell activity is assessed using the GAS/SEAP/Neo construct produced in
Example 12. Thus, factors that increase SEAP activity indicate the ability to activate the
Jaks-STATs signal transduction pathway. The myeloid cell used in this assay is U937,
a pre-monocyte cell line, although TF-1, HL60, or KG1 can be used.

To transiently transfect U937 cells with the GAS/SEAP/Neo construct produced
in Example 12, a DEAE-Dextran method (Kharbanda et al., 1994, Cell Growth &
Differentiation, 5:259-265) is used. First, harvest 2 x 10^7 U937 cells and wash with
PBS. The U937 cells are usually grown in RPMI 1640 medium containing 10% heat-
inactivated fetal bovine serum (FBS) supplemented with 100 units/ml penicillin and 100
mg/ml streptomycin.

Next, suspend the cells in 1 ml of 20 mM Tris-HCl (pH 7.4) buffer containing
0.5 mg/ml DEAE-Dextran, 8 ug GAS-SEAP2 plasmid DNA, 140 mM NaCl, 5 mM
KCl, 375 uM Na2HPO4.7H2O, 1 mM MgCl2, and 675 uM CaCl2. Incubate at 37°C
for 45 min.

Wash the cells with RPMI 1640 medium containing 10% FBS and then
resuspend in 10 ml complete medium and incubate at 37°C for 36 hr.

The GAS-SEAP/U937 stable cells are obtained by growing the cells in 400
ug/ml G418. G418-free medium is used for routine growth, but every one to two
months the cells should be re-grown in 400 ug/ml G418 for a couple of passages.

These cells are tested by harvesting 1 x 10^8 cells (this is enough for ten 96-well
plate assays) and washing with PBS. Suspend the cells in 200 ml of the above described
growth medium, with a final density of 5 x 10^5 cells/ml. Plate 200 ul cells per well in
the 96-well plate (or 1 x 10^5 cells/well).

Add 50 ul of the supernatant prepared by the protocol described in Example 11.
Incubate at 37°C for 48 to 72 hr. As a positive control, 100 Unit/ml interferon gamma,
which is known to activate U937 cells, can be used. Over 30 fold induction is typically
observed in the positive control wells. SEAP assay the supernatant according to the
protocol described in Example 17.

Example 15: High-Throughput Screening Assay Identifying Neuronal
Activity.

When cells undergo differentiation and proliferation, a group of genes is
activated through many different signal transduction pathways. One of these genes,
EGR1 (early growth response gene 1), is induced in various tissues and cell types upon
activation. The promoter of EGR1 is responsible for such induction. Using the EGR1
promoter linked to reporter molecules, activation of cells can be assessed.

Particularly, the following protocol is used to assess neuronal activity in PC12
cell lines. PC12 cells (rat pheochromocytoma cells) are known to proliferate and/or
differentiate by activation with a number of mitogens, such as TPA (tetradecanoyl
phorbol acetate), NGF (nerve growth factor), and EGF (epidermal growth factor). The
EGR1 gene expression is activated during this treatment. Thus, by stably transfecting
PC12 cells with a construct containing an EGR promoter linked to SEAP reporter,
activation of PC12 cells can be assessed.

The EGR/SEAP reporter construct can be assembled by the following protocol.
The EGR-1 promoter sequence (-633 to +1) (Sakamoto K et al., Oncogene 6:867-871
(1991)) can be PCR amplified from human genomic DNA using the following primers:

5'-GCGCTCGAGGGATGACAGCGATAGAACCCCGG-3' (SEQ ID NO:6)

5'-GCGAAGCTTCGCGACTCCCCGGATCCGCCTC-3' (SEQ ID NO:7)

Using the GAS:SEAP/Neo vector produced in Example 12, EGR1 amplified
product can then be inserted into this vector. Linearize the GAS:SEAP/Neo vector
using restriction enzymes XhoI/HindIII, removing the GAS/SV40 stuffer. Restrict the
EGR1 amplified product with these same enzymes. Ligate the vector and the EGR1
promoter.
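
Before the XhoI/HindIII digest, it can be useful to confirm which recognition sites each primer carries. The sketch below scans the two EGR-1 primers for three common six-base sites; the helper name `find_sites` is hypothetical, and the recognition sequences are the standard ones for these enzymes.

```python
RECOGNITION_SITES = {
    "XhoI": "CTCGAG",
    "HindIII": "AAGCTT",
    "BamHI": "GGATCC",
}

EGR_FORWARD = "GCGCTCGAGGGATGACAGCGATAGAACCCCGG"  # SEQ ID NO:6
EGR_REVERSE = "GCGAAGCTTCGCGACTCCCCGGATCCGCCTC"   # SEQ ID NO:7


def find_sites(seq: str) -> dict:
    """Map enzyme name to 0-based start positions of its site in seq."""
    hits = {}
    for enzyme, site in RECOGNITION_SITES.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


forward_sites = find_sites(EGR_FORWARD)
reverse_sites = find_sites(EGR_REVERSE)
```

On the sequences as printed, the forward primer carries the XhoI site and the reverse primer carries the HindIII site, matching the digest described above; the reverse primer also contains a GGATCC (BamHI) sequence, which is worth keeping in mind if BamHI is used elsewhere in a cloning scheme.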

To prepare 96 well-plates for cell culture, two ml of a coating solution (1:30
dilution of collagen type I (Upstate Biotech Inc. Cat#08-115) in 30% ethanol (filter
sterilized)) is added per one 10 cm plate or 50 ml per well of the 96-well plate, and
allowed to air dry for 2 hr.

PC12 cells are routinely grown in RPMI-1640 medium (Bio Whittaker)
containing 10% horse serum (JRH BIOSCIENCES, Cat. # 12449-78P), 5% heat-
inactivated fetal bovine serum (FBS) supplemented with 100 units/ml penicillin and 100
ug/ml streptomycin on a precoated 10 cm tissue culture dish. A one to four split is done
every three to four days. Cells are removed from the plates by scraping and
resuspended by pipetting up and down more than 15 times.

Transfect the EGR/SEAP/Neo construct into PC12 using the Lipofectamine
protocol described in Example 11. EGR-SEAP/PC12 stable cells are obtained by
growing the cells in 300 ug/ml G418. G418-free medium is used for routine
growth, but every one to two months the cells should be re-grown in 300 ug/ml G418
for a couple of passages.

To assay for neuronal activity, a 10 cm plate with cells around 70 to 80%
confluent is screened by removing the old medium. Wash the cells once with PBS
(Phosphate buffered saline). Then starve the cells in low serum medium (RPMI-1640
containing 1% horse serum and 0.5% FBS with antibiotics) overnight.

The next morning, remove the medium and wash the cells with PBS. Scrape
the cells off the plate and suspend them well in 2 ml low serum medium. Count
the cell number and add more low serum medium to reach a final cell density of
5 x 10^5 cells/ml.

Add 200 ul of the cell suspension to each well of a 96-well plate (equivalent to
1 x 10^5 cells/well). Add 50 ul supernatant produced by Example 11 and incubate at
37°C for 48 to 72 hr. As a positive control, a growth factor known to activate PC12
cells through EGR can be used, such as 50 ng/ul of Neuronal Growth Factor (NGF).
Over fifty-fold induction of SEAP is typically seen in the positive control wells. SEAP
assay the supernatant according to Example 17.

Example 16: High-Throughput Screening Assay for T-cell Activity 

NF-kB (Nuclear Factor kB) is a transcription factor activated by a wide variety
of agents including the inflammatory cytokines IL-1 and TNF, CD30 and CD40,
lymphotoxin-alpha and lymphotoxin-beta, by exposure to LPS or thrombin, and by
expression of certain viral gene products. As a transcription factor, NF-kB regulates
the expression of genes involved in immune cell activation, control of apoptosis
(NF-kB appears to shield cells from apoptosis), B and T-cell development, anti-viral and
antimicrobial responses, and multiple stress responses.

In non-stimulated conditions, NF-kB is retained in the cytoplasm with I-kB
(Inhibitor kB). However, upon stimulation, I-kB is phosphorylated and degraded,
causing NF-kB to shuttle to the nucleus, thereby activating transcription of target
genes. Target genes activated by NF-kB include IL-2, IL-6, GM-CSF, ICAM-1 and
class 1 MHC.

Due to its central role and ability to respond to a range of stimuli, reporter
constructs utilizing the NF-kB promoter element are used to screen the supernatants
produced in Example 11. Activators or inhibitors of NF-kB would be useful in treating
diseases. For example, inhibitors of NF-kB could be used to treat those diseases
related to the acute or chronic activation of NF-kB, such as rheumatoid arthritis.

To construct a vector containing the NF-kB promoter element, a PCR based
strategy is employed. The upstream primer contains four tandem copies of the NF-kB
binding site (GGGGACTTTCCC) (SEQ ID NO:8), 18 bp of sequence complementary
to the 5' end of the SV40 early promoter sequence, and is flanked with an XhoI site:
5':GCGGCCTCGAGGGGACTTTCCCGGGGACTTTCCGGGGACTTTCCGGGAC
TTTCCATCCTGCCATCTCAATTAG:3' (SEQ ID NO:9)

The downstream primer is complementary to the 3' end of the SV40 promoter
and is flanked with a Hind III site:

5':GCGGCAAGCTTTTTGCAAAGCCTAGGC:3' (SEQ ID NO:4)

PCR amplification is performed using the SV40 promoter template present in
the pB-gal promoter plasmid obtained from Clontech. The resulting PCR fragment is
digested with XhoI and Hind III and subcloned into BLSK2-. (Stratagene)
Sequencing with the T7 and T3 primers confirms the insert contains the following
sequence:

5':CTCGAGGGGACTTTCCCGGGGACTTTCCGGGGACTTTCCGGGACTTTCC
ATCTGCCATCTCAATTAGTCAGCAACCATAGTCCCGCCCCTAACTCCGCCCA
TCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGCCCCATGGCTGACT
AATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCGGCCTCTGAGCTATTC
CAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTTTTGCAAAAAGCTT:
3' (SEQ ID NO:10)

Next, replace the SV40 minimal promoter element present in the pSEAP2-
promoter plasmid (Clontech) with this NF-kB/SV40 fragment using XhoI and HindIII.
However, this vector does not contain a neomycin resistance gene, and therefore, is not
preferred for mammalian expression systems.

In order to generate stable mammalian cell lines, the NF-kB/SV40/SEAP
cassette is removed from the above NF-kB/SEAP vector using restriction enzymes SalI
and NotI, and inserted into a vector containing neomycin resistance. Particularly, the
NF-kB/SV40/SEAP cassette was inserted into pGFP-1 (Clontech), replacing the GFP
gene, after restricting pGFP-1 with SalI and NotI.



WO 98/54963 



X1S98/U422 




Once the NF-kB/SV40/SEAP/Neo vector is created, stable Jurkat T-cells are
created and maintained according to the protocol described in Example 13. Similarly,
the method for assaying supernatants with these stable Jurkat T-cells is also described
in Example 13. As a positive control, exogenous TNF alpha (0.1, 1, 10 ng) is added to
wells H9, H10, and H11, with a 5-10 fold activation typically observed.

Example 17: Assay for SEAP Activity 

As a reporter molecule for the assays described in Examples 13-16, SEAP
activity is assayed using the Tropix Phospho-light Kit (Cat. BP-400) according to the
following general procedure. The Tropix Phospho-light Kit supplies the Dilution,
Assay, and Reaction Buffers used below.

Prime a dispenser with the 2.5x Dilution Buffer and dispense 15 ul of 2.5x
dilution buffer into Optiplates containing 35 ul of a supernatant. Seal the plates with a
plastic sealer and incubate at 65°C for 30 min. Separate the Optiplates to avoid uneven
heating.

Cool the samples to room temperature for 15 minutes. Empty the dispenser and
prime with the Assay Buffer. Add 50 ul Assay Buffer and incubate at room
temperature for 5 min. Empty the dispenser and prime with the Reaction Buffer (see the
table below). Add 50 ul Reaction Buffer and incubate at room temperature for 20
minutes. Since the intensity of the chemiluminescent signal is time dependent, and it
takes about 10 minutes to read 5 plates on the luminometer, one should treat 5 plates at
a time and start the second set 10 minutes later.

Read the relative light unit in the luminometer. Set H 12 as blank, and print the 
results. An increase in chemiluminescence indicates reporter activity. 
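
The procedure states that well H12 serves as the blank but does not spell out how an increase in signal is expressed. A minimal sketch, assuming fold activation is taken as the ratio of blank-subtracted relative light units (RLU) for a treated well over an untreated control well; the function and argument names are illustrative and are not part of the kit protocol:

```python
def fold_activation(sample_rlu, control_rlu, blank_rlu):
    """Return the blank-subtracted sample signal divided by the
    blank-subtracted control signal (assumed definition of fold activation)."""
    sample = sample_rlu - blank_rlu
    control = control_rlu - blank_rlu
    if control <= 0:
        # A control at or below the blank gives no meaningful ratio.
        raise ValueError("control signal must exceed the blank")
    return sample / control


# Hypothetical readings: treated well, untreated well, and blank well H12.
print(fold_activation(5200, 1200, 200))
```

With these made-up readings the ratio falls in the 5-10 fold range reported for the TNF alpha positive control in Example 16.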

Reaction Buffer Formulation:

# of plates    Rxn buffer diluent (ml)    CSPD (ml)
10             60                         3
11             65                         3.25
12             70                         3.5
13             75                         3.75
14             80                         4
15             85                         4.25
16             90                         4.5
17             95                         4.75
18             100                        5
19             105                        5.25
20             110                        5.5
21             115                        5.75
22             120                        6
23             125                        6.25
24             130                        6.5
25             135                        6.75
26             140                        7
27             145                        7.25
28             150                        7.5
29             155                        7.75
30             160                        8
31             165                        8.25
32             170                        8.5
33             175                        8.75
34             180                        9
35             185                        9.25
36             190                        9.5
37             195                        9.75
38             200                        10
39             205                        10.25
40             210                        10.5
41             215                        10.75
42             220                        11
43             225                        11.25
44             230                        11.5
45             235                        11.75
46             240                        12
47             245                        12.25
48             250                        12.5
49             255                        12.75
50             260                        13
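
The Reaction Buffer table follows a regular pattern: each additional plate adds 5 ml of diluent, and the CSPD volume is always one twentieth of the diluent volume. The sketch below reproduces the tabulated values from that inferred rule. The formula is read off the table rather than stated in the text, so it is restricted here to the tabulated range of 10 to 50 plates; the function name is illustrative.

```python
def reaction_buffer_volumes(n_plates):
    """Return (diluent_ml, cspd_ml) for a given number of plates.

    Inferred from the table: diluent = 5 * n + 10 ml, CSPD = diluent / 20.
    """
    if not 10 <= n_plates <= 50:
        raise ValueError("the table only covers 10 to 50 plates")
    diluent_ml = 5 * n_plates + 10
    cspd_ml = diluent_ml / 20
    return diluent_ml, cspd_ml


for n in (10, 25, 50):
    print(n, reaction_buffer_volumes(n))
```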



Example 18: High-Throughput Screening Assay Identifying Changes in 

Small Molecule Concentration and Membrane Permeability 

Binding of a ligand to a receptor is known to alter intracellular levels of small
molecules, such as calcium, potassium, sodium, and pH, as well as alter membrane
potential. These alterations can be measured in an assay to identify supernatants which
bind to receptors of a particular cell. Although the following protocol describes an
assay for calcium, this protocol can easily be modified to detect changes in potassium,
sodium, pH, membrane potential, or any other small molecule which is detectable by a
fluorescent probe.

The following assay uses a Fluorometric Imaging Plate Reader ("FLIPR") to
measure changes in fluorescent molecules (Molecular Probes) that bind small
molecules. Clearly, any fluorescent molecule detecting a small molecule can be used
instead of the calcium fluorescent molecule, fluo-3, used here.

For adherent cells, seed the cells at 10,000-20,000 cells/well in a Co-star black
96-well plate with clear bottom. The plate is incubated in a CO2 incubator for 20 hours.
The adherent cells are washed two times in a Biotek washer with 200 ul of HBSS
(Hank's Balanced Salt Solution), leaving 100 ul of buffer after the final wash.

A stock solution of 1 mg/ml fluo-3 is made in 10% pluronic acid DMSO. To
load the cells with fluo-3, 50 ul of 12 ug/ml fluo-3 is added to each well. The plate is
incubated at 37°C in a CO2 incubator for 60 min. The plate is washed four times in the
Biotek washer with HBSS, leaving 100 ul of buffer.

For non-adherent cells, the cells are spun down from culture media. Cells are
re-suspended to 2-5x10^6 cells/ml with HBSS in a 50-ml conical tube. 4 ul of 1 mg/ml
fluo-3 solution in 10% pluronic acid DMSO is added to each ml of cell suspension.
The tube is then placed in a 37°C water bath for 30-60 min. The cells are washed twice
with HBSS, resuspended to 1x10^6 cells/ml, and dispensed into a microplate, 100
ul/well. The plate is centrifuged at 1000 rpm for 5 min. The plate is then washed once
in Denley CellWash with 200 ul, followed by an aspiration step to 100 ul final volume.
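
For the non-adherent loading step, the amount of dye stock scales with the suspension volume. A small sketch of that arithmetic, assuming the stated 4 ul of 1 mg/ml stock per ml of suspension; the function name and the correction for the added stock volume are illustrative choices, not part of the protocol:

```python
def fluo3_loading(suspension_ml, stock_mg_per_ml=1.0, stock_ul_per_ml=4.0):
    """Return (stock volume in ul, approximate final dye concentration in ug/ml)."""
    stock_ul = stock_ul_per_ml * suspension_ml
    # 1 mg/ml is the same as 1 ug/ul, so ul of stock times mg/ml gives ug of dye.
    dye_ug = stock_ul * stock_mg_per_ml
    final_ml = suspension_ml + stock_ul / 1000.0
    return stock_ul, dye_ug / final_ml


stock_ul, conc = fluo3_loading(10)
print(f"{stock_ul:.0f} ul stock, about {conc:.2f} ug/ml fluo-3")
```

For a 10 ml suspension this works out to 40 ul of stock and a final concentration just under 4 ug/ml.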

For a non-cell based assay, each well contains a fluorescent molecule, such as 
fluo-3. The supernatant is added to the well, and a change in fluorescence is detected. 

To measure the fluorescence of intracellular calcium, the FLIPR is set for the
following parameters: (1) System gain is 300-800 mW; (2) Exposure time is 0.4
second; (3) Camera F/stop is F/2; (4) Excitation is 488 nm; (5) Emission is 530 nm; and
(6) Sample addition is 50 ul. Increased emission at 530 nm indicates an extracellular
signaling event which has resulted in an increase in the intracellular Ca++
concentration.

Example 19: High-Throughput Screening Assay Identifying Tyrosine 
Kinase Activity 

The Protein Tyrosine Kinases (PTK) represent a diverse group of
transmembrane and cytoplasmic kinases. Within the Receptor Protein Tyrosine Kinase
(RPTK) group are receptors for a range of mitogenic and metabolic growth factors
including the PDGF, FGF, EGF, NGF, HGF and Insulin receptor subfamilies. In
addition there is a large family of RPTKs for which the corresponding ligand is
unknown. Ligands for RPTKs include mainly secreted small proteins, but also
membrane-bound and extracellular matrix proteins.

Activation of RPTK by ligands involves ligand-mediated receptor dimerization,
resulting in transphosphorylation of the receptor subunits and activation of the
cytoplasmic tyrosine kinases. The cytoplasmic tyrosine kinases include receptor
associated tyrosine kinases of the src-family (e.g., src, yes, lck, lyn, fyn) and
non-receptor linked and cytosolic protein tyrosine kinases, such as the Jak family, members
of which mediate signal transduction triggered by the cytokine superfamily of receptors
(e.g., the Interleukins, Interferons, GM-CSF, and Leptin).

Because of the wide range of known factors capable of stimulating tyrosine
kinase activity, the identification of novel human secreted proteins capable of activating
tyrosine kinase signal transduction pathways is of interest. Therefore, the following
protocol is designed to identify those novel human secreted proteins capable of
activating the tyrosine kinase signal transduction pathways.

Seed target cells (e.g., primary keratinocytes) at a density of approximately
25,000 cells per well in 96-well Loprodyne Silent Screen Plates purchased from
Nalge Nunc (Naperville, IL). The plates are sterilized with two 30 minute rinses with
100% ethanol, rinsed with water and dried overnight. Some plates are coated for 2 hr
with 100 ul of cell culture grade type I collagen (50 ug/ml), gelatin (2%) or polylysine
(50 ug/ml), all of which can be purchased from Sigma Chemicals (St. Louis, MO), or
10% Matrigel purchased from Becton Dickinson (Bedford, MA), or calf serum, rinsed
with PBS and stored at 4°C. Cell growth on these plates is assayed by seeding 5,000
cells/well in growth medium and indirectly quantitating cell number through use of
alamarBlue, as described by the manufacturer, Alamar Biosciences, Inc. (Sacramento,
CA), after 48 hr. Falcon plate covers #3071 from Becton Dickinson (Bedford, MA) are
used to cover the Loprodyne Silent Screen Plates. Falcon Microtest III cell culture
plates can also be used in some proliferation experiments.

To prepare extracts, A431 cells are seeded onto the nylon membranes of
Loprodyne plates (20,000/200 ul/well) and cultured overnight in complete medium.
Cells are quiesced by incubation in serum-free basal medium for 24 hr. After 5-20
minutes of treatment with EGF (60 ng/ml) or 50 ul of the supernatant produced in Example
11, the medium is removed and 100 ul of extraction buffer (20 mM HEPES pH
7.5, 0.15 M NaCl, 1% Triton X-100, 0.1% SDS, 2 mM Na3VO4, 2 mM Na4P2O7
and a cocktail of protease inhibitors (#1836170) obtained from Boehringer Mannheim
(Indianapolis, IN)) is added to each well and the plate is shaken on a rotating shaker for
5 minutes at 4°C. The plate is then placed in a vacuum transfer manifold and the extract
filtered through the 0.45 um membrane bottoms of each well using house vacuum.
Extracts are collected in a 96-well catch/assay plate in the bottom of the vacuum
manifold and immediately placed on ice. To obtain extracts clarified by centrifugation,
the content of each well, after detergent solubilization for 5 minutes, is removed and
centrifuged for 15 minutes at 4°C at 16,000 x g.

Test the filtered extracts for levels of tyrosine kinase activity. Although many 
methods of detecting tyrosine kinase activity are known, one method is described here. 

Generally, the tyrosine kinase activity of a supernatant is evaluated by
determining its ability to phosphorylate a tyrosine residue on a specific substrate (a
biotinylated peptide). Biotinylated peptides that can be used for this purpose include
PSK1 (corresponding to amino acids 6-20 of the cell division kinase cdc2-p34) and
PSK2 (corresponding to amino acids 1-17 of gastrin). Both peptides are substrates for
a range of tyrosine kinases and are available from Boehringer Mannheim.
The tyrosine kinase reaction is set up by adding the following components in
order. First, add 10 ul of 5 uM Biotinylated Peptide, then 10 ul ATP/Mg2+ (5 mM
ATP/50 mM MgCl2), then 10 ul of 5x Assay Buffer (40 mM imidazole hydrochloride,
pH 7.3, 40 mM beta-glycerophosphate, 1 mM EGTA, 100 mM MgCl2, 5 mM MnCl2,
0.5 mg/ml BSA), then 5 ul of Sodium Vanadate (1 mM), and then 5 ul of water. Mix the
components gently and preincubate the reaction mix at 30°C for 2 min. Initiate the
reaction by adding 10 ul of the control enzyme or the filtered supernatant.
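
The component volumes listed for the kinase reaction add up to 50 ul once the enzyme or supernatant is included, which is consistent with 10 ul of 5x Assay Buffer ending up at 1x. When many wells are run, the pre-enzyme components are commonly pooled as a master mix. The sketch below scales the listed volumes; the 10% overage default and all names are assumptions for illustration, not part of the protocol.

```python
# Per-reaction volumes (ul), in the order of addition given in the text.
COMPONENTS_UL = [
    ("biotinylated peptide (5 uM)", 10.0),
    ("ATP/Mg2+", 10.0),
    ("5x Assay Buffer", 10.0),
    ("sodium vanadate (1 mM)", 5.0),
    ("water", 5.0),
]
ENZYME_UL = 10.0  # control enzyme or filtered supernatant, added last


def total_reaction_volume_ul():
    """Volume of one complete reaction, including the enzyme addition."""
    return sum(vol for _, vol in COMPONENTS_UL) + ENZYME_UL


def master_mix(n_reactions, overage=0.10):
    """Scale the pre-enzyme components for n reactions plus pipetting overage."""
    scale = n_reactions * (1 + overage)
    return {name: round(vol * scale, 2) for name, vol in COMPONENTS_UL}


print(total_reaction_volume_ul())
for name, vol in master_mix(96).items():
    print(f"{name}: {vol} ul")
```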

The tyrosine kinase assay reaction is then terminated by adding 10 ul of 120 mM
EDTA and placing the reactions on ice.

Tyrosine kinase activity is determined by transferring a 50 ul aliquot of the reaction
mixture to a microtiter plate (MTP) module and incubating at 37°C for 20 min. This
allows the streptavidin-coated 96-well plate to associate with the biotinylated peptide.
Wash the MTP module with 300 ul/well of PBS four times. Next add 75 ul of
anti-phosphotyrosine antibody conjugated to horseradish peroxidase (anti-P-Tyr-POD
(0.5 u/ml)) to each well and incubate at 37°C for one hour. Wash the wells as
above.

Next add 100 ul of peroxidase substrate solution (Boehringer Mannheim) and
incubate at room temperature for at least 5 min (up to 30 min). Measure the
absorbance of the sample at 405 nm using an ELISA reader. The level of bound
peroxidase activity is quantitated using an ELISA reader and reflects the level of
tyrosine kinase activity.

Example 20: High-Throughput Screening Assay Identifying 
Phosphorylation Activity 

As a potential alternative and/or complement to the assay of protein tyrosine
kinase activity described in Example 19, an assay which detects activation
(phosphorylation) of major intracellular signal transduction intermediates can also be
used. For example, as described below, one particular assay can detect tyrosine
phosphorylation of the Erk-1 and Erk-2 kinases. However, phosphorylation of other
molecules, such as Raf, JNK, p38 MAP, Map kinase kinase (MEK), MEK kinase,
Src, Muscle specific kinase (MuSK), IRAK, Tec, and Janus, as well as any other
phosphoserine, phosphotyrosine, or phosphothreonine molecule, can be detected by
substituting these molecules for Erk-1 or Erk-2 in the following assay.

Specifically, assay plates are made by coating the wells of a 96-well ELISA
plate with 0.1 ml of protein G (1 ug/ml) for 2 hr at room temp. (RT). The plates are then
rinsed with PBS and blocked with 3% BSA/PBS for 1 hr at RT. The protein G plates
are then treated with 2 commercial monoclonal antibodies (100 ng/well) against Erk-1
and Erk-2 (1 hr at RT) (Santa Cruz Biotechnology). (To detect other molecules, this
step can easily be modified by substituting a monoclonal antibody detecting any of the
above described molecules.) After 3-5 rinses with PBS, the plates are stored at 4°C
until use.

A431 cells are seeded at 20,000/well in a 96-well Loprodyne filterplate and
cultured overnight in growth medium. The cells are then starved for 48 hr in basal
medium (DMEM) and then treated with EGF (6 ng/well) or 50 ul of the supernatants
obtained in Example 11 for 5-20 minutes. The cells are then solubilized and extracts
filtered directly into the assay plate.

After incubation with the extract for 1 hr at RT, the wells are again rinsed. As a
positive control, a commercial preparation of MAP kinase (10 ng/well) is used in place
of A431 extract. Plates are then treated with a commercial polyclonal (rabbit) antibody
(1 ug/ml) which specifically recognizes the phosphorylated epitope of the Erk-1 and
Erk-2 kinases (1 hr at RT). This antibody is biotinylated by standard procedures. The
bound polyclonal antibody is then quantitated by successive incubations with
Europium-streptavidin and Europium fluorescence enhancing reagent in the Wallac
DELFIA instrument (time-resolved fluorescence). An increased fluorescent signal over
background indicates phosphorylation.

Example 21: Method of Determining Alterations in a Gene 
Corresponding to a Polynucleotide 

RNA is isolated from entire families or individual patients presenting with a
phenotype of interest (such as a disease). cDNA is then generated from
these RNA samples using protocols known in the art. (See, Sambrook.) The cDNA is
then used as a template for PCR, employing primers surrounding regions of interest in
SEQ ID NO:X. Suggested PCR conditions consist of 35 cycles at 95°C for 30
seconds; 60-120 seconds at 52-58°C; and 60-120 seconds at 70°C, using buffer
solutions described in Sidransky, D., et al., Science 252:706 (1991).
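
Because the annealing and extension steps are given as ranges, the total cycling time depends on the values chosen. A rough sketch of the hold-time arithmetic for the suggested conditions follows. It counts only the three holds per cycle and ignores ramp times and any initial denaturation or final extension, none of which the text specifies; the names are illustrative.

```python
CYCLES = 35
DENATURE_S = 30  # 95 C hold per cycle


def cycling_time_minutes(anneal_s, extend_s, cycles=CYCLES):
    """Total hold time in minutes for the suggested three-step cycle."""
    for label, seconds in (("anneal", anneal_s), ("extend", extend_s)):
        if not 60 <= seconds <= 120:
            raise ValueError(f"{label} time must be 60-120 seconds")
    return cycles * (DENATURE_S + anneal_s + extend_s) / 60


print(cycling_time_minutes(60, 60))    # shortest suggested program
print(cycling_time_minutes(120, 120))  # longest suggested program
```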
PCR products are then sequenced using primers labeled at their 5' end with T4
polynucleotide kinase, employing SequiTherm Polymerase (Epicentre Technologies).
The intron-exon borders of selected exons are also determined and genomic PCR
products analyzed to confirm the results. PCR products harboring suspected mutations
are then cloned and sequenced to validate the results of the direct sequencing.

PCR products are cloned into T-tailed vectors as described in Holton, T.A. and
Graham, M.W., Nucleic Acids Research, 19:1156 (1991) and sequenced with T7
polymerase (United States Biochemical). Affected individuals are identified by
mutations not present in unaffected individuals.

Genomic rearrangements are also observed as a method of determining
alterations in a gene corresponding to a polynucleotide. Genomic clones isolated
according to Example 2 are nick-translated with digoxigenin-deoxy-uridine 5'-triphosphate
(Boehringer Mannheim), and FISH performed as described in Johnson,
Cg. et al., Methods Cell Biol. 35:73-99 (1991). Hybridization with the labeled probe is
carried out using a vast excess of human cot-1 DNA for specific hybridization to the
corresponding genomic locus.

Chromosomes are counterstained with 4',6-diamidino-2-phenylindole and
propidium iodide, producing a combination of C- and R-bands. Aligned images for
precise mapping are obtained using a triple-band filter set (Chroma Technology,
Brattleboro, VT) in combination with a cooled charge-coupled device camera
(Photometrics, Tucson, AZ) and variable excitation wavelength filters. (Johnson, Cv.
et al., Genet. Anal. Tech. Appl., 8:75 (1991).) Image collection, analysis and
chromosomal fractional length measurements are performed using the ISee Graphical
Program System (Inovision Corporation, Durham, NC). Chromosome alterations of
the genomic region hybridized by the probe are identified as insertions, deletions, and
translocations. These alterations are used as a diagnostic marker for an associated
disease.

Example 22: Method of Detecting Abnormal Levels of a Polypeptide in a 
Biological Sample 

A polypeptide of the present invention can be detected in a biological sample,
and if an increased or decreased level of the polypeptide is detected, this polypeptide is
a marker for a particular phenotype. Methods of detection are numerous, and thus, it is
understood that one skilled in the art can modify the following assay to fit their
particular needs.

For example, antibody-sandwich ELISAs are used to detect polypeptides in a
sample, preferably a biological sample. Wells of a microtiter plate are coated with
specific antibodies, at a final concentration of 0.2 to 10 ug/ml. The antibodies are either
monoclonal or polyclonal and are produced by the method described in Example 10.
The wells are blocked so that non-specific binding of the polypeptide to the well is
reduced.

The coated wells are then incubated for > 2 hours at RT with a sample
containing the polypeptide. Preferably, serial dilutions of the sample should be used to
validate results. The plates are then washed three times with deionized or distilled water
to remove unbound polypeptide.

Next, 50 ul of specific antibody-alkaline phosphatase conjugate, at a
concentration of 25-400 ng, is added and incubated for 2 hours at room temperature.
The plates are again washed three times with deionized or distilled water to remove
unbound conjugate.

Add 75 ul of 4-methylumbelliferyl phosphate (MUP) or p-nitrophenyl
phosphate (NPP) substrate solution to each well and incubate 1 hour at room
temperature. Measure the reaction with a microtiter plate reader. Prepare a standard
curve, using serial dilutions of a control sample, and plot polypeptide concentration on
the X-axis (log scale) and fluorescence or absorbance on the Y-axis (linear scale).
Interpolate the concentration of the polypeptide in the sample using the standard curve.
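
The interpolation step can be sketched as piecewise-linear interpolation on a log-concentration axis, which matches the semi-log plot described above. This is one simple way to read a value off the curve, assumed here for illustration; a fitted curve (for example a four-parameter logistic) is a common alternative that the text neither requires nor excludes. The standards in the usage lines are made-up values.

```python
import math


def interpolate_concentration(standards, signal):
    """Estimate concentration from a signal using (concentration, signal) standards.

    Interpolates linearly in signal against log10(concentration) between the
    two neighboring standards that bracket the measured signal.
    """
    points = sorted((math.log10(conc), sig) for conc, sig in standards)
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        low, high = min(y0, y1), max(y0, y1)
        if low <= signal <= high and y0 != y1:
            x = x0 + (signal - y0) * (x1 - x0) / (y1 - y0)
            return 10 ** x
    raise ValueError("signal is outside the range of the standard curve")


# Hypothetical standards: concentration in ng/ml, absorbance units.
standards = [(1, 0.1), (10, 0.5), (100, 0.9)]
print(interpolate_concentration(standards, 0.3))
```

Signals outside the range spanned by the standards raise an error rather than being extrapolated, which is why serial dilutions of the sample are useful.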

Example 23: Formulating a Polypeptide 

The secreted polypeptide composition will be formulated and dosed in a fashion
consistent with good medical practice, taking into account the clinical condition of the
individual patient (especially the side effects of treatment with the secreted polypeptide
alone), the site of delivery, the method of administration, the scheduling of
administration, and other factors known to practitioners. The "effective amount" for
purposes herein is thus determined by such considerations.

As a general proposition, the total pharmaceutically effective amount of secreted
polypeptide administered parenterally per dose will be in the range of about 1 ug/kg/day
to 10 mg/kg/day of patient body weight, although, as noted above, this will be subject
to therapeutic discretion. More preferably, this dose is at least 0.01 mg/kg/day, and
most preferably for humans between about 0.01 and 1 mg/kg/day for the hormone. If
given continuously, the secreted polypeptide is typically administered at a dose rate of
about 1 ug/kg/hour to about 50 ug/kg/hour, either by 1-4 injections per day or by
continuous subcutaneous infusions, for example, using a mini-pump. An intravenous
bag solution may also be employed. The length of treatment needed to observe changes
and the interval following treatment for responses to occur appear to vary depending
on the desired effect.
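
The ranges above are expressed per kilogram of body weight, so absolute amounts follow by simple multiplication. The sketch below shows only that unit arithmetic for the stated general range (1 ug/kg/day to 10 mg/kg/day) and the continuous rate (1 to 50 ug/kg/hour). The 70 kg body weight and the function names are assumptions for illustration.

```python
def daily_dose_range_mg(weight_kg, low_mg_per_kg=0.001, high_mg_per_kg=10.0):
    """Return (low, high) total daily amount in mg for the stated general range."""
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return weight_kg * low_mg_per_kg, weight_kg * high_mg_per_kg


def hourly_infusion_range_ug(weight_kg, low_ug_per_kg=1.0, high_ug_per_kg=50.0):
    """Return (low, high) hourly amount in ug for the stated continuous rate."""
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return weight_kg * low_ug_per_kg, weight_kg * high_ug_per_kg


print(daily_dose_range_mg(70))
print(hourly_infusion_range_ug(70))
```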

Pharmaceutical compositions containing the secreted protein of the invention are
administered orally, rectally, parenterally, intracisternally, intravaginally,
intraperitoneally, topically (as by powders, ointments, gels, drops or transdermal
patch), buccally, or as an oral or nasal spray. "Pharmaceutically acceptable carrier" refers
to a non-toxic solid, semisolid or liquid filler, diluent, encapsulating material or
formulation auxiliary of any type. The term "parenteral" as used herein refers to modes
of administration which include intravenous, intramuscular, intraperitoneal, intrasternal,
subcutaneous and intraarticular injection and infusion.

The secreted polypeptide is also suitably administered by sustained-release
systems. Suitable examples of sustained-release compositions include semi-permeable
polymer matrices in the form of shaped articles, e.g., films, or microcapsules.
Sustained-release matrices include polylactides (U.S. Pat. No. 3,773,919, EP 58,481),
copolymers of L-glutamic acid and gamma-ethyl-L-glutamate (Sidman, U. et al.,
Biopolymers 22:547-556 (1983)), poly (2-hydroxyethyl methacrylate) (R. Langer et
al., J. Biomed. Mater. Res. 15:167-277 (1981), and R. Langer, Chem. Tech. 12:98-105
(1982)), ethylene vinyl acetate (R. Langer et al.) or poly-D-(-)-3-hydroxybutyric
acid (EP 133,988). Sustained-release compositions also include liposomally entrapped
polypeptides. Liposomes containing the secreted polypeptide are prepared by methods
known per se: DE 3,218,121; Epstein et al., Proc. Natl. Acad. Sci. USA 82:3688-3692
(1985); Hwang et al., Proc. Natl. Acad. Sci. USA 77:4030-4034 (1980); EP 52,322;
EP 36,676; EP 88,046; EP 143,949; EP 142,641; Japanese Pat. Appl. 83-118008;
U.S. Pat. Nos. 4,485,045 and 4,544,545; and EP 102,324. Ordinarily, the liposomes
are of the small (about 200-800 Angstroms) unilamellar type in which the lipid content
is greater than about 30 mol. percent cholesterol, the selected proportion being adjusted
for the optimal secreted polypeptide therapy.

For parenteral administration, in one embodiment, the secreted polypeptide is
formulated generally by mixing it at the desired degree of purity, in a unit dosage
injectable form (solution, suspension, or emulsion), with a pharmaceutically acceptable
carrier, i.e., one that is non-toxic to recipients at the dosages and concentrations
employed and is compatible with other ingredients of the formulation. For example, the
formulation preferably does not include oxidizing agents and other compounds that are
known to be deleterious to polypeptides.

Generally, the formulations are prepared by contacting the polypeptide
uniformly and intimately with liquid carriers or finely divided solid carriers or both.
Then, if necessary, the product is shaped into the desired formulation. Preferably the
carrier is a parenteral carrier, more preferably a solution that is isotonic with the blood
of the recipient. Examples of such carrier vehicles include water, saline, Ringer's
solution, and dextrose solution. Non-aqueous vehicles such as fixed oils and ethyl
oleate are also useful herein, as well as liposomes.

The carrier suitably contains minor amounts of additives such as substances that
enhance isotonicity and chemical stability. Such materials are non-toxic to recipients at
the dosages and concentrations employed, and include buffers such as phosphate,
citrate, succinate, acetic acid, and other organic acids or their salts; antioxidants such as
ascorbic acid; low molecular weight (less than about ten residues) polypeptides, e.g.,
polyarginine or tripeptides; proteins, such as serum albumin, gelatin, or
immunoglobulins; hydrophilic polymers such as polyvinylpyrrolidone; amino acids,
such as glycine, glutamic acid, aspartic acid, or arginine; monosaccharides,
disaccharides, and other carbohydrates including cellulose or its derivatives, glucose,
mannose, or dextrins; chelating agents such as EDTA; sugar alcohols such as mannitol or
sorbitol; counterions such as sodium; and/or nonionic surfactants such as polysorbates,
poloxamers, or PEG.

The secreted polypeptide is typically formulated in such vehicles at a
concentration of about 0.1 mg/ml to 100 mg/ml, preferably 1-10 mg/ml, at a pH of
about 3 to 8. It will be understood that the use of certain of the foregoing excipients,
carriers, or stabilizers will result in the formation of polypeptide salts.

Any polypeptide to be used for therapeutic administration can be sterile.
Sterility is readily accomplished by filtration through sterile filtration membranes (e.g.,
0.2 micron membranes). Therapeutic polypeptide compositions generally are placed
into a container having a sterile access port, for example, an intravenous solution bag or
vial having a stopper pierceable by a hypodermic injection needle.

Polypeptides ordinarily will be stored in unit or multi-dose containers, for
example, sealed ampoules or vials, as an aqueous solution or as a lyophilized
formulation for reconstitution. As an example of a lyophilized formulation, 10-ml vials
are filled with 5 ml of sterile-filtered 1% (w/v) aqueous polypeptide solution, and the
resulting mixture is lyophilized. The infusion solution is prepared by reconstituting the
lyophilized polypeptide using bacteriostatic Water-for-Injection.
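
The lyophilized-vial example implies a fixed mass of polypeptide per vial: a 1% (w/v) solution contains 1 g per 100 ml, or 10 mg/ml, so a 5 ml fill holds 50 mg. A short sketch of that arithmetic follows. The reconstitution volume in the usage lines is hypothetical, since the text does not state one.

```python
def polypeptide_mass_mg(fill_volume_ml, percent_w_v):
    """Mass of polypeptide in a vial; 1% (w/v) corresponds to 10 mg/ml."""
    return fill_volume_ml * percent_w_v * 10.0


def reconstituted_concentration(mass_mg, diluent_ml):
    """Concentration in mg/ml after adding the given volume of diluent."""
    if diluent_ml <= 0:
        raise ValueError("diluent volume must be positive")
    return mass_mg / diluent_ml


mass = polypeptide_mass_mg(5, 1)
print(mass)                                  # mg per vial
print(reconstituted_concentration(mass, 5))  # mg/ml with a hypothetical 5 ml of diluent
```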

The invention also provides a pharmaceutical pack or kit comprising one or
more containers filled with one or more of the ingredients of the pharmaceutical
compositions of the invention. Associated with such container(s) can be a notice in the
form prescribed by a governmental agency regulating the manufacture, use or sale of
pharmaceuticals or biological products, which notice reflects approval by the agency of
manufacture, use or sale for human administration. In addition, the polypeptides of the
present invention may be employed in conjunction with other therapeutic compounds.

Example 24: Method of Treating Decreased Levels of the Polypeptide

It will be appreciated that conditions caused by a decrease in the standard or
normal expression level of a secreted protein in an individual can be treated by
administering the polypeptide of the present invention, preferably in the secreted form.
Thus, the invention also provides a method of treatment of an individual in need of an
increased level of the polypeptide comprising administering to such an individual a
pharmaceutical composition comprising an amount of the polypeptide to increase the
activity level of the polypeptide in such an individual.

For example, a patient with decreased levels of a polypeptide receives a daily
dose of 0.1-100 ug/kg of the polypeptide for six consecutive days. Preferably, the
polypeptide is in the secreted form. The exact details of the dosing scheme, based on
administration and formulation, are provided in Example 23.

Example 25: Method of Treating Increased Levels of the Polypeptide 

Antisense technology is used to inhibit production of a polypeptide of the
present invention. This technology is one example of a method of decreasing levels of
a polypeptide, preferably a secreted form, due to a variety of etiologies, such as cancer.

For example, a patient diagnosed with abnormally increased levels of a
polypeptide is administered intravenously antisense polynucleotides at 0.5, 1.0, 1.5,
2.0 and 3.0 mg/kg/day for 21 days. This treatment is repeated after a 7-day rest period
if the treatment was well tolerated. The formulation of the antisense polynucleotide is
provided in Example 23.

Example 26: Method of Treatment Using Gene Therapy 

One method of gene therapy transplants fibroblasts, which are capable of
expressing a polypeptide, onto a patient. Generally, fibroblasts are obtained from a
subject by skin biopsy. The resulting tissue is placed in tissue-culture medium and
separated into small pieces. Small chunks of the tissue are placed on a wet surface of a
tissue culture flask; approximately ten pieces are placed in each flask. The flask is
turned upside down, closed tight and left at room temperature overnight. After 24
hours at room temperature, the flask is inverted, the chunks of tissue remain fixed to
the bottom of the flask, and fresh media (e.g., Ham's F12 media, with 10% FBS,
penicillin and streptomycin) is added. The flasks are then incubated at 37°C for
approximately one week.

At this time, fresh media is added and subsequently changed every several days.
After an additional two weeks in culture, a monolayer of fibroblasts emerges. The
monolayer is trypsinized and scaled into larger flasks.

pMV-7 (Kirschmeier, P.T. et al., DNA, 7:219-25 (1988)), flanked by the long
terminal repeats of the Moloney murine sarcoma virus, is digested with EcoRI and
HindIII and subsequently treated with calf intestinal phosphatase. The linear vector is
fractionated on agarose gel and purified, using glass beads.

The cDNA encoding a polypeptide of the present invention can be amplified
using PCR primers which correspond to the 5' and 3' end sequences respectively as set
forth in Example 1. Preferably, the 5' primer contains an EcoRI site and the 3' primer
includes a HindIII site. Equal quantities of the Moloney murine sarcoma virus linear
backbone and the amplified EcoRI and HindIII fragment are added together, in the
presence of T4 DNA ligase. The resulting mixture is maintained under conditions
appropriate for ligation of the two fragments. The ligation mixture is then used to
transform bacteria HB101, which are then plated onto agar containing kanamycin for
the purpose of confirming that the vector has the gene of interest properly inserted.

The amphotropic pA317 or GP+am12 packaging cells are grown in tissue
culture to confluent density in Dulbecco's Modified Eagles Medium (DMEM) with 10%
calf serum (CS), penicillin and streptomycin. The MSV vector containing the gene is
then added to the media and the packaging cells transduced with the vector. The
packaging cells now produce infectious viral particles containing the gene (the
packaging cells are now referred to as producer cells).

Fresh media is added to the transduced producer cells, and subsequently, the
media is harvested from a 10 cm plate of confluent producer cells. The spent media,
containing the infectious viral particles, is filtered through a millipore filter to remove
detached producer cells and this media is then used to infect fibroblast cells. Media is
removed from a sub-confluent plate of fibroblasts and quickly replaced with the media
from the producer cells. This media is removed and replaced with fresh media. If the
titer of virus is high, then virtually all fibroblasts will be infected and no selection is
required. If the titer is very low, then it is necessary to use a retroviral vector that has a
selectable marker, such as neo or his. Once the fibroblasts have been efficiently
infected, the fibroblasts are analyzed to determine whether protein is produced.

The engineered fibroblasts are then transplanted onto the host, either alone or
after having been grown to confluence on cytodex 3 microcarrier beads.

Example 27: Method of Treatment Using Gene Therapy - In Vivo

Another aspect of the present invention is using in vivo gene therapy
methods to treat disorders, diseases and conditions. The gene therapy method
relates to the introduction of naked nucleic acid (DNA, RNA, and antisense
DNA or RNA) sequences into an animal to increase or decrease the expression
of the polypeptide of the present invention. A polynucleotide of the present
invention may be operatively linked to a promoter or any other genetic elements
necessary for the expression of the encoded polypeptide by the target tissue.
Such gene therapy and delivery techniques and methods are known in the art,
see, for example, WO90/11092, WO98/11779; U.S. Patent Nos. 5693622,
5705151, 5580859; Tabata H. et al. (1997) Cardiovasc. Res. 35(3):470-479,
Chao J. et al. (1997) Pharmacol. Res. 35(6):517-522, Wolff J.A. (1997)
Neuromuscul. Disord. 7(5):314-318, Schwartz B. et al. (1996) Gene Ther.
3(5):405-411, Tsurumi Y. et al. (1996) Circulation 94(12):3281-3290
(incorporated herein by reference).

The polynucleotide constructs of the present invention may be delivered
by any method that delivers injectable materials to the cells of an animal, such
as injection into the interstitial space of tissues (heart, muscle, skin, lung, liver,
intestine and the like). These polynucleotide constructs can be delivered in a
pharmaceutically acceptable liquid or aqueous carrier.

The term "naked" polynucleotide, DNA or RNA, refers to sequences 
that are free from any delivery vehicle that acts to assist, promote, or facilitate 
entry into the cell, including viral sequences, viral particles, liposome 
formulations, lipofectin or precipitating agents and the like. However, the 
polynucleotides may also be delivered in liposome formulations (such as those 
taught in Felgner P.L. et al. (1995) Ann. NY Acad. Sci. 772:126-139 and 
Abdallah B. et al. (1995) Biol. Cell 85(1):1-7) which can be prepared by 
methods well known to those skilled in the art. 

The polynucleotide vector constructs of the present invention used in 
the gene therapy method are preferably constructs that will not integrate into the 
host genome nor will they contain sequences that allow for replication. Any 
strong promoter known to those skilled in the art can be used for driving the 
expression of DNA. Unlike other gene therapy techniques, one major 
advantage of introducing naked nucleic acid sequences into target cells is the 
transitory nature of the polynucleotide synthesis in the cells. Studies have 
shown that non-replicating DNA sequences can be introduced into cells to 
provide production of the desired polypeptide for periods of up to six months. 

The polynucleotide construct of the present invention can be delivered to 
the interstitial space of tissues within an animal, including muscle, skin, 
brain, lung, liver, spleen, bone marrow, thymus, heart, lymph, blood, bone, 
cartilage, pancreas, kidney, gall bladder, stomach, intestine, testis, ovary, 
uterus, rectum, nervous system, eye, gland, and connective tissue. Interstitial 
space of the tissues comprises the intercellular fluid, mucopolysaccharide matrix 
among the reticular fibers of organ tissues, elastic fibers in the walls of vessels or 
chambers, collagen fibers of fibrous tissues, or that same matrix within 
connective tissue ensheathing muscle cells or in the lacunae of bone. It is 
similarly the space occupied by the plasma of the circulation and the lymph fluid 
of the lymphatic channels. Delivery to the interstitial space of muscle tissue is 
preferred for the reasons discussed below. The constructs may be conveniently delivered 
by injection into the tissues comprising these cells. They are preferably delivered 
to and expressed in persistent, non-dividing cells which are differentiated, 
although delivery and expression may be achieved in non-differentiated or less 
completely differentiated cells, such as, for example, stem cells of blood or skin 
fibroblasts. In vivo muscle cells are particularly competent in their ability to take 
up and express polynucleotides. 

For the naked polynucleotide injection, an effective dosage amount of 
DNA or RNA will be in the range of from about 0.05 µg/kg body weight to about 
50 mg/kg body weight. Preferably the dosage will be from about 0.005 mg/kg 
to about 20 mg/kg and more preferably from about 0.05 mg/kg to about 5 mg/kg. 
Of course, as the artisan of ordinary skill will appreciate, this dosage will vary 
according to the tissue site of injection. The appropriate and effective dosage of 
nucleic acid sequence can readily be determined by those of ordinary skill in the 
art and may depend on the condition being treated and the route of 
administration. The preferred route of administration is by the parenteral route of 
injection into the interstitial space of tissues. However, other parenteral routes 
may also be used, such as inhalation of an aerosol formulation, particularly for 
delivery to lungs or bronchial tissues, throat or mucous membranes of the nose. 
In addition, naked polynucleotide constructs can be delivered to arteries during 
angioplasty by the catheter used in the procedure. 
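
The per-kilogram ranges above translate into absolute amounts once a body weight is fixed. The following is a minimal sketch of that arithmetic only, not a dosing recommendation; the function name and the 0.02 kg mouse weight are illustrative assumptions that do not appear in the text.

```python
# Dose ranges quoted in the text, in mg of nucleic acid per kg body weight.
PREFERRED_MG_PER_KG = (0.005, 20.0)
MORE_PREFERRED_MG_PER_KG = (0.05, 5.0)


def absolute_dose_mg(body_weight_kg, dose_range_mg_per_kg):
    """Scale a (low, high) mg/kg range to an absolute (low, high) range in mg."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = dose_range_mg_per_kg
    return (low * body_weight_kg, high * body_weight_kg)


if __name__ == "__main__":
    mouse_kg = 0.02  # hypothetical body weight, chosen only for illustration
    low, high = absolute_dose_mg(mouse_kg, MORE_PREFERRED_MG_PER_KG)
    print(f"{low:.4f} mg to {high:.4f} mg per animal")
```

As the text notes, the effective amount still varies with the tissue site, the condition being treated, and the route of administration, so a calculation of this kind only bounds the starting range.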

The dose response effects of injected polynucleotide in muscle in vivo are 
determined as follows. Suitable template DNA for production of mRNA coding 
for the polypeptide of the present invention is prepared in accordance with a 
standard recombinant DNA methodology. The template DNA, which may be 
either circular or linear, is either used as naked DNA or complexed with 
liposomes. The quadriceps muscles of mice are then injected with various 
amounts of the template DNA. 

Five to six week old female and male Balb/C mice are anesthetized by 
intraperitoneal injection with 0.3 ml of 2.5% Avertin. A 1.5 cm incision is made 
on the anterior thigh, and the quadriceps muscle is directly visualized. The 
template DNA is injected in 0.1 ml of carrier in a 1 cc syringe through a 27 gauge 
needle over one minute, approximately 0.5 cm from the distal insertion site of the 
muscle into the knee and about 0.2 cm deep. A suture is placed over the 
injection site for future localization, and the skin is closed with stainless steel 
clips. 

After an appropriate incubation time (e.g., 7 days) muscle extracts are prepared 
by excising the entire quadriceps. Every fifth 15 µm cross-section of the individual 
quadriceps muscles is histochemically stained for protein expression. A time course for 
protein expression may be done in a similar fashion except that quadriceps from 
different mice are harvested at different times. Persistence of DNA in muscle following 
injection may be determined by Southern blot analysis after preparing total cellular DNA 
and HIRT supernatants from injected and control mice. The results of the above 
experimentation in mice can be used to extrapolate proper dosages and other treatment 
parameters in humans and other animals using naked DNA of the present invention. 

It will be clear that the invention may be practiced otherwise than as particularly 
described in the foregoing description and examples. Numerous modifications and 
variations of the present invention are possible in light of the above teachings and, 
therefore, are within the scope of the appended claims. 

The entire disclosure of each document cited (including patents, patent 
applications, journal articles, abstracts, laboratory manuals, books, or other 
disclosures) in the Background of the Invention, Detailed Description, and Examples is 
hereby incorporated herein by reference. 

Sequence Listing 

(1) GENERAL INFORMATION: 

(i) APPLICANT: Human Genome Sciences, Inc., et al. 

(ii) TITLE OF INVENTION: 207 Human Secreted Proteins 

(iii) NUMBER OF SEQUENCES: 800 

(iv) CORRESPONDENCE ADDRESS: 
(A) ADDRESSEE: Human Genome Sciences, Inc. 
(B) STREET: 9410 Key West Avenue 
(C) CITY: Rockville 
(D) STATE: Maryland 
(E) COUNTRY: USA 
(F) ZIP: 20850 

(v) COMPUTER READABLE FORM: 
(A) MEDIUM TYPE: Diskette, 3.50 inch, 1.4Mb storage 
(B) COMPUTER: HP Vectra 486/33 
(C) OPERATING SYSTEM: MSDOS version 6.2 
(D) SOFTWARE: ASCII Text 



(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 
(A) APPLICATION NUMBER: 
(B) FILING DATE: 

(viii) ATTORNEY/AGENT INFORMATION: 
(A) NAME: Kenley K. Hoover 
(B) REGISTRATION NUMBER: 40,302 
(C) REFERENCE/DOCKET NUMBER: PZ0C7PCT 

(ix) TELECOMMUNICATION INFORMATION: 
(A) TELEPHONE: (301) 309-3504 
(B) TELEFAX: (301) 309-8439 

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 733 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

GGGATCCGGA GCCCAAATCT TCTGACAAAA CTCACACATG CCCACCGTGC CCAGCACCTG 60 
AATTCGAGGG TGCACCGTCA GTCTTCCTCT TCCCCCCAAA ACCCAAGGAC ACCCTCATGA 120 
TCTCCCGGAC TCCTGAGGTC ACATGCGTGG TGGTGGACGT AAGCCACGAA GACCCTGAGG 180 
TCAAGTTCAA CTGGTACGTG GACGGCGTGG AGGTGCATAA TGCCAAGACA AAGCCGCGGG 240 
AGGAGCAGTA CAACAGCACG TACCGTGTGG TCAGCGTCCT CACCGTCCTG CACCAGGACT 300 
GGCTGAATGG CAAGGAGTAC AAGTGCAAGG TCTCCAACAA AGCCCTCCCA ACCCCCATCG 360 
AGAAAACCAT CTCCAAAGCC AAAGGGCAGC CCCGAGAACC ACAGGTGTAC ACCCTGCCCC 420 
CATCCCGGGA TGAGCTGACC AAGAACCAGG TCAGCCTGAC CTGCCTGGTC AAAGGCTTCT 480 
ATCCAAGCGA CATCGCCGTG GAGTGGGAGA GCAATGGGCA GCCGGAGAAC AACTACAAGA 540 
CCACGCCTCC CGTGCTGGAC TCCGACGGCT CCTTCTTCCT CTACAGCAAG CTCACCGTGG 600 
ACAAGAGCAG GTGGCAGCAG GGGAACGTCT TCTCATGCTC CGTGATGCAT GAGGCTCTGC 660 
ACAACCACTA CACGCAGAAG AGCCTCTCCC TGTCTCCGGG TAAATGAGTG CGACGGCCGC 720 
GACTCTAGAG GAT 733 


(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 5 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Trp Ser Xaa Trp Ser 
1               5 
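
A short motif with a variable position, such as the five-residue entry above, can be turned into a search pattern. The sketch below assumes the motif reads Trp-Ser-Xaa-Trp-Ser, with Xaa standing for any residue as is conventional in sequence listings; the helper names and the test peptide are hypothetical and are not taken from this document.

```python
import re

# Three-letter to one-letter codes for the residues used in the motif.
THREE_TO_ONE = {"Trp": "W", "Ser": "S"}


def motif_to_regex(three_letter_motif):
    """Convert a space-separated three-letter motif into a regex; Xaa matches any residue."""
    parts = []
    for residue in three_letter_motif.split():
        parts.append("." if residue == "Xaa" else THREE_TO_ONE[residue])
    return "".join(parts)


PATTERN = re.compile(motif_to_regex("Trp Ser Xaa Trp Ser"))


def find_motif(protein):
    """Return 0-based start positions of non-overlapping motif matches in a one-letter sequence."""
    return [match.start() for match in PATTERN.finditer(protein.upper())]


if __name__ == "__main__":
    print(motif_to_regex("Trp Ser Xaa Trp Ser"))
    print(find_motif("MKWSAWSLL"))  # hypothetical peptide used only as an example
```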



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 86 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

GCGCCTCGAG ATTTCCCCGA AATCTAGATT TCCCCGAAAT GATTTCCCCG AAATGATTTC 60 
CCCGAAATAT CTGCCATCTC AATTAG 86 


(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 27 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

GCGGCAAGCT TTTTGCAAAG CCTAGGC 27 
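
Each entry declares a LENGTH and then lists the residues in groups of ten, usually with a running position counter at the end of the row. A rough consistency check is to join the groups and compare the result with the declared length. The sketch below is one way to do that under stated assumptions: purely numeric tokens are treated as counters, and the function names are illustrative rather than part of any standard tool.

```python
# IUPAC nucleotide codes, including the ambiguity codes (R, Y, S, W, K, M, ...).
IUPAC_NUCLEOTIDES = set("ACGTURYSWKMBDHVN")


def parse_listing_rows(rows):
    """Join the residue groups of a listing entry, skipping numeric position counters."""
    groups = []
    for row in rows:
        for token in row.split():
            if token.isdigit():  # assumed to be a running position counter
                continue
            groups.append(token.upper())
    return "".join(groups)


def check_entry(rows, declared_length):
    """Compare the parsed sequence with its declared LENGTH and flag non-IUPAC characters."""
    sequence = parse_listing_rows(rows)
    return {
        "length": len(sequence),
        "length_ok": len(sequence) == declared_length,
        "unexpected": sorted(set(sequence) - IUPAC_NUCLEOTIDES),
    }


if __name__ == "__main__":
    # SEQ ID NO: 4 as listed above, declared as 27 base pairs.
    print(check_entry(["GCGGCAAGCT TTTTGCAAAG CCTAGGC"], 27))
```

A length mismatch or a non-empty `unexpected` list points to a row worth re-reading against the original filing; it does not by itself say which residue is wrong.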




(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 271 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

CTCGAGATTT CCCCGAAATC TAGATTTCCC CGAAATGATT TCCCCGAAAT GATTTCCCCG 60 
AAATATCTGC CATCTCAATT AGTCAGCAAC CATAGTCCCG CCCCTAACTC CGCCCATCCC 120 
GCCCCTAACT CCGCCCAGTT CCGCCCATTC TCCGCCCCAT GGCTGACTAA TTTTTTTTAT 180 
TTATGCAGAG GCCGAGGCCG CCTCGGCCTC TGAGCTATTC CAGAAGTAGT GAGGAGGCTT 240 
TTTTGGAGGC CTAGGCTTTT GCAAAAAGCT T 271 



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 32 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

GCGCTCGAGG GATGACAGCG ATAGAACCCC GG 32 



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 31 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

GCGAAGCTTC GCGACTCCCC GGATCCGCCT C 31 


(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 12 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

GGGGACTTTC CC 12 
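
Because the entries are listed as double-stranded, a short site such as the 12-mer above can occur in a longer sequence on either strand. The sketch below shows one way to search both orientations; it is an illustration under simple assumptions (plain A/C/G/T residues, exact matching), and the function names and the test strings are hypothetical.

```python
# Complement table for unambiguous DNA bases only; ambiguity codes are not handled.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string made of A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_site(seq, site):
    """Return sorted (position, strand) pairs for every overlapping exact match of site.

    Positions are 0-based on the given strand; '-' means the reverse complement matched.
    A palindromic site would be reported once per strand at the same position.
    """
    seq, site = seq.upper(), site.upper()
    hits = []
    for strand, pattern in (("+", site), ("-", reverse_complement(site))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


SEQ_ID_NO_8 = "GGGGACTTTCCC"  # as listed above

if __name__ == "__main__":
    print(reverse_complement(SEQ_ID_NO_8))
    print(find_site("AAGGGGACTTTCCCTT", SEQ_ID_NO_8))  # hypothetical test string
```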




(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 73 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

GCGGCCTCGA GGGGACTTTC CCGGGGACTT TCCGGGGACT TTCCGGGACT TTCCATCCTG 60 
CCATCTCAAT TAG 73 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 256 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

CTCGAGGGGA CTTTCCCGGG GACTTTCCGG GGACTTTCCG GGACTTTCCA TCTGCCATCT 60 
CAATTAGTCA GCAACCATAG TCCCGCCCCT AACTCCGCCC ATCCCGCCCC TAACTCCGCC 120 
CAGTTCCGCC CATTCTCCGC CCCATGGCTG ACTAATTTTT TTTATTTATG CAGAGGCCGA 180 
GGCCGCCTCG GCCTCTGAGC TATTCCAGAA GTAGTGAGGA GGGTTTTTTG GAGGCCTAGG 240 
CTTTTGCAAA AAGCTT 256 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 2526 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

GACAGGCTAT CCGAGAATCT GAGAGCTGGG CCCGGCAATT CCTCCAGYTA CCCTTGTGAC 60 
CTAAGTCCAG TCACACATTT CCCAAAGTTT CTCTTTGTCA TAACCCTGGT CTGGCTGGTT 120 
TTGRGGRCTT GAGAATGGGT CAGGGACTCC AGGCCAAGTC CAACAGAGAC CCCAAACCCA 180 
CCACACACCA GCAGCCACAA CCTCACCACC AACAAAGAGG ACTTTTGTGG GGCCACAAGT 240 
AAGAGGTCAT TTCTGGAATG GACTCAGACC TTTAAACAGG AGAGTTGAGC ACTTCCAGKS 300 
AGTTTTTAAG CAAGGCATGG GGAACAGGGA ATAGAACCTT TCAAAGAGGT TGCCCAGAGA 360 
AAAGCTGGGC CTCTTGCATT CGGCTTCCTT GGAGCAGCCT CTTCTGGCAG AAAGCCATCA 420 
GGTGCTCAAT CATCTTCTCC TGGCCAAGGC TCTGACCATG CTTAGTACTG GAATAGAGGT 480 
GGCCAGGCCC CCAGCGACTC TTCTTGGCCT GATGTTTGTC CTCACAGGCA TCCCACGTGG 540 
CCTGA3ATGA TTCAGAACAA ATCATGCTAA CTTTGAATCC ATCCAGCCAC TTGCAAATGA 600 
TAATCAGAAG TCAGCTTGTT CACTGTTAGA AAGAAACTAA CAAAAGAGAA CCCAGAGCAA 660 
TCTAGAATCT TTGAGTGCTT GGCTTTCCAA GGATACTGCG GAGACTCTGG CCAAGCTGAT 720 
GAMCTTCTGA ARTGTCACTG GCACCATATG CAACAAGAAC CACCATTCAC TGAGTAGCTA 780 
ATX3GGTTTGG GGCCTGGGAC ATTCCATCTG AGGTCCTTCC TGAACATGTC ACTCCACAGC 840 
AGAGGACCGG TTGCAGCTTA CCCAGAACCA CTCCTCCAGG AGAGCTGGAT GTTTTGCGTG 900 
CAACACCTTG AGCACTGACT GCTATTGTTC AAAAAAAGCC TTTGCTGCAT TCGGAGGACT 960 
GCCGCGTGCC CTGAGGTGAC TTCCTAACTA TGTGGTTTCA TTAGCGAATT TATTTTTTGT 1020 
GCTGGGTGGA CATTTGTATT TTGTTAGGTT GCTGTTTAAG CTCAAGTTTG CTGTGCTCTC 1080 
TGCAGCTACA AAACATCTTG GCATATTTAA GAKTGGCTTT TATAAATAGC TTTATTCTGA 1140 
TATTAATCAG ATTCCCAACT TTACTGAGAA TTAAGGACTG GGGTACTTTA AAGAAATGCA 1200 
AATAGCAATT GAAGAACCAC TGCTGCAGGT GGTAGCCCTG GCTAGACTGA ATTACACTAG 1260 
AAATCAGCCA GAAGGAAGCG TCCTTGGGAT CCCAGATCAC TCTTTTTTTT TTTTTTTTTA 1320 
AAAGGGGCAG CCCGTTGATG GCTCATCTCT CTGAATAACA GTTACGTCTT CATATCGATA 1380 
CCAGATGCCT TCTTCATCAT GCCACTGAAG CCACTCACCA CCTTCAAGAA CATGCCAACG 1440 
TCTGTCAGAT TCACTTACCC ACAAACAAGG AGGCACGTTT GGCACAAAGT GTTGTCCTCC 1500 
AGGTCCAAGT GGACTCTACA GAGTGCTTGA CCTCAACACA CTGGATTCCA GGTGGACTGG 1560 
ACGAAGAGCA GGCAAAGACA CGGGAACTGA AAAACTCCAC AGGGTTTGGA GAATAGAAAT 1620 
GAAAAGCCAC GTCATATAAC TCAAGAATAA ATGGTGTTTT GGAAATTTTA AAATTATCAT 1680 
CGAAGGTGGT GAAACTATTT CAGGCCCAAA TGAAAGGAAA TCGCCAGTTG GGGATGAAAT 1740 
CACAGAGCCT GTGTTTTATG ATATGGTTGG ATGTCCACTG ATGAAATTTT AAAGGAGTTT 1800 
CATTTTTAAA AGTGCGCATG ATTCTACATA TGAGAATTCT TTAGGCCAAG AAACTGTCCT 1860 
TGGCTCAGAG GTGTTGGGAA TTAAAGCAGA GAGAAGCCAT TCGTGATGCT TAGAACCAAG 1920 
GATGGTCATG TACACAAAGA CCATCGAGAC GGCCATTCTT GTTTACAAAA CACTTACCAA 1980 
GAAAGCACTT TGTAGGGGAA GTTTAGTAAG TTCTTCTCAT TTCATTATGT TTCTTCCAAG 2040 
GAAACAGGAG AGACTGAATT AATAATTCTC TCTTTCCTCT TAAGCACTTT TAAAATAATA 2100 
AAGTACATCT TGAAATTTGG GGGGGCATCT CTGATTTAAA AAAAGAAAAA GGCTGCTTGA 2160 
TGTATGTTAT GCAGAGACAC TCTGCCTCTG GTGGCTGCAG AGCAATACCC AAGCCTCATT 2220 
TGGAAGGCTC AACATTTGGA ATCCAC7TT AATTGATTAA TCGTCAATTC ATGTGGCCTT 2280 
ACGGGATGGT GG3TCTGGGA CCCCAATTCA TTCTTATCTG CCAAAGAATT ATCTAGAAGC 2340 
ACATCAAATA CCAGCACCCC ACCTGCACAA TGGGGGTGGA AAACTTTTGT ATCCCTAAGC 2400 
ATATTATTTT ATA3TGTCTG CCVTGCCATG TGGAAATACT TTATTTTTAA CCTCAGGATT 2460 
TAAATAAAGT AAACACTATG ACATTTAAAA AAAAAAAAAA AAAACTCGAG GGGGGCCCGG 2520 
TACCCA 2526 


(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1131 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

CACTGCACCA GCTTTGTTAT CTGTAAAATG ATGATAATAC CAACACCTTC TTCTTGGGGT 60 
ACTGAAGATG AGAGAACATG ATATGTGTAA AGTGCCTTCC ACAATACCCA GAACATAGCA 120 
AACATGTAAT GAATGTAGTA ATAGTAATTA TTTTATTTTC TTTTGATTCA GTTGGGACTA 180 
TGTTCAGCTG TAACAGAATA CCCAAAATAA CTGTTTTAAA CAAATTAAAG TTTWGTTGTG 240 
AAGTTTTGTT ACGAATTCAG ACAATCCAGG GCTTTTATAG ATGCACCAGG ATCAGCAGGT 300 
ACAAAGGCAT CTTTCCTGAT TTCTGCCAGT CTCAATGCAT GGGTTGCAAT CCAGARTCCA 360 
RGATGGCAGT TCCAGCCCTG GTTACGCCCA TATTAGCACA CAGAAAGAAA GAGAAAGGGA 420 
TGTGCCTCTT CACTTTAATC ATAGCTCCCA CTAGATGCAC CCACTACTTC TGCTGATACT 480 
CCATTAGCTA ATGCTTGCTT ACATGGTCAC ACTTAGTTTC CAGAGAGACA TGTCTGGACA 540 
GTCATGTGCT CAATTAATAT CCAAGTGTCC AATTACTGAG AAAAAAAGAA ACTAGCACCT 600 
TTGCTTGGTT GCATTCCTCT TAGCATAAGC CACATTCTTT TTATGAAGTT GTCCTCAGTT 660 
ACTTGGATGC CTCAGTTGTC CTTTCAWTTA GAAAWGCYCC TKGGACAYCC TGAAWCTGAC 720 
TTCTTTTGTC ATCAGCACCA TCACTACCAC TGCCYTCTTC AAAGCCACCA CGTTCTGTCC 780 
CCAGGATGGT TGCAACAACC ACCATAGGGA CTTTTTGCCT TCTACTTCCA CACAATAGNC 840 
CAGAGTAAGC TTTTGAAAAT GTAGGTCAGA TCATGTCTCT CTCTTCCTCT TCAAAACCCT 900 
CCCGATGGCT TTTCATATTA CTCAAAAGAA AACCTAAAAC TTTGCTGTGA GATCTATGTG 960 
ACCCGGCTTA TTCTTCCTCT TACTTTATCT CTGTATTGCT CTTCCTCACT CTACTCCAGC 1020 
CATCCCACCT CCTTGCTGCT TGTCCTATAC TCCTAAAAGA AGTTCAGTCT TCCCTTATGA 1080 
TATTTGCACT TAAAATAGAA AAAAAAAAAA AAAAAAAACT CGAGGGGGGC C 1131 



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 941 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

GGCACGAGTA GCATTTCATT TAATCTGCAG GTATATTCTC CCAACAGTTT ATTGTCATGT 60 
GATGTCCTCA GCCAAGATTG TRAGGCAGAG AGGAGCTGTC CCAACCTACT ATACCACCGA 120 
GGCTGGAGAG ATCATATTTT TGGTATTAAA CTGGAGTCTC TCCATCCTTC ACATTGTTGA 180 
TGTCCTCTGT AGCAAACCGG AAAAGTCAGT GACAGAAGAT GCCGCTAGCG GTTTGAGCCA 240 
GAGAATGACA GCTCTGGTTT GGAGAAAAGG GCCGGATGGT GGCTCTAGAA AGCCCATCCT 300 
TCTGCTCTTC TTTTTTCTCC CCCTTATATT GTGCTTTCAT TCATTCATTC ATTCATCAAA 360 
CATTTGTTGA GCACCTATTA TGTGTCAAGC TCTGTGCTAG CCTCTGGAAA ACCTGCCCTC 420 
ATGTAGCTCA CTGTGGAGTA GGAGAAACAA TGACTACACT ATGATAAGCA CGGGTTGTCA 480 
GGGTCTCACA GAGCAGTGGC CCCTCATCCA GACCGATGAG GTCAAAGAAG GCATCCAGGC 540 
GAGGATGGTG TCAGAGCTAA CTGAAGAATG AGAGGGAGCT GCACCASCAG GGGTTGGAAC 600 
TGAAGGTGGC AGTGCCTGGA GTCTTGATTC CAGCAGAGGG AGAGCAGTCT GTGAAAAGGC 660 
ACCAAGGGTG GGAGAGGGCA GAGCACATGG AGGAACTTCA GGTAGTTCTG GATGGCSCTG 720 
GGGCAAAGCT AGAGAGGTAA GAAGAATCTA CAAATGTTCC TCGAGTTACA TGAACTTCCA 780 
TCCCAATAAA CCCATTGGAA ACGAAAAATT TAAGTCAGAA GTGCATTTAA GGCTGGTCCG 840 
AGTAGAATGA TTTTTACAAC GAATTGATCA CAACCAGTTA CAGATGTCTT TGTTCCTTCT 900 
CCACTCCCAC TGCTTCACCT GACTAGCCTT TAAAAAAAAA A 941 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 843 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 



CNAGGGATAA CCCCAAAGrTT GGGAAATAAA CCCTCAATTA AAGGGGGAAC CAAAAAGC I 3 
GGAAGTTCCC CCCCGCGGTG CC3GCCNGtIT CTAGGAACTA GTGGAATCCC CCGGGGCTGC 120 
AGGGAATTCG GCACC-GAC-TG GGAATGTTGT TTGTATGATA CTATTTCCAC AAWATGCATT 180 
GAGACTTGGT KTGTGGCCTA GGACATGGTC AATT>CTTTYT AAATATTCCG TGAATTTCTT 240 
TAGTGCATAT TCTCZGATGG GGGCTGTOGG GACAGAGTTC TAAATATGCC CATTAGATTA 300 
AATCTCTTCA TTCTGTTGCT CACATCTTCT ATATCCTTAT TAATCTGTCA ATCTCTTCAA 360 
GAGAGGTGTT ATTAAAATCT CTCACTGTAT GTGTCACTTT GCCCTTAAAA TTCTGATGAT 420 
TTGCTTTATA AATGGTTATA ACCATTTTCC AGGAAGAACA TTAAAGAACT TTCCATTGGC 480 
ATTATCCAGT TTCCCTCAAA ATACTGGTTT TTTTTATTTT GGCTNCTAAG CAGCTATGAA 540 
TCCAGTTTCT CAGAAGCCCT TGTCTCAAGG CATTTGTTTC CAGATTACCT TGTTAGCATC 600 
CACACTATGG GCTATTTTAG AAAAACAAAA AAAGTATCAA AATCATATAG CTATGATTTT 660 
CCTGTGCTTG AAGGAGCCTT AAAGCTCATC TAGTCCAGCC AGTATTTGTT CATCCAAATT 720 
CTGCCAAGAA ATCTCTATTG TCAAGATATT CTTTACCATC TTTGGGACAT TCTCATTATT 780 
AGAAACAAAT CCTAAGAAGA AATTCTGCCA TAKACAACCC ATCCGTTCTT TAAAAAAAAA 840 
AAA 843 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1018 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

CTGTAATTTT TAATTTTCAT ATACCGTGCT TTGATTCTAA TTTTATTTTT TGAGTTCTCT 60 
GAAGGTTACA TATACAGAGT GCTTCAGGAA TGATCATTTT GTTATTATTC ATGCTTCTTA 120 
ACAATGTTGT TTTAGTCCAA GAAGATAATT GCCAGAGAAA GAATACAGTG CAGGAAAGAA 180 
GARGCTGGAG CCAGTGGTGA AGARGGATTG AGARGACAGA CATTGTGGGA ATGAAATCAT 240 
GAATAATCGT GTTTTTGAAT TGTCCAAAAA CTTCTACAAA CCATGAAATG TTGGAGTTTA 300 
AATCTAATTG TTGAAAAATT CCCCACATTC CTTGTATCCC TTAGGTTGAG CATAATTCCA 360 
CATCCGTGGA CTGATGCACT TC^CAAGAGG GGGCCTCATT AACTCTTCCG AGGCAGCAGC 420 
AGCAAGGGCA CCCCCTCCTT TCCCCCCACA CCCCAYTTCT CATGGCTCTT CTTTCTCTCA 480 
TCTCATGCTT AGGTTAGAAA AGGGCACAA3 GTAAGGAAGC CCTTGGGAAT AGGCTGAATC 540 
TGGCTATCTA ATTTGGTGCC AAATACTTAA TGTGCTTGAA TTTAAAAACA GCAAACATGT 600 
AGAAAGGTAA TTATAATTAT GAGGCCAGTT CTTTAAGCTA GCT1TTTTTC CCCTCTCAAA 660 
CAGCATATTG GCTTGGATGT CAGCAGGAGA AAGTGTTTTT TGCAATACAC ATAATGCATA 720 
TATGGTCCTG TTAGCAATCT ATAGAAAATA GATATTGCTC ATTAAGGTAA ATATTTTTGT 780 
TGATGAATGA TCTGGAATGG TCTGGACTTG TTGTGTGAAC AGGAAATTGC TCTGTAGGCT 840 
TTGACTTGT3 AGGTAAAGAG TGAGGCTGGT AAGATTAATT AAAGTAAATA CTGTGACAAT 900 
AGGATGTCAA AA^CAAAAAC GTGTTTCTGA AACTCAAGGA ATTAATGACA CATAGGGAAG 960 
TTTTTGCCAT ATTAAGCATA GAGTAGGAGA GGCAAGTCAA GAATAAAAAA AAAAAAAA 1018 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 661 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

TTTAAGAAAT TAGTGAATCC CCGGNTGCAG GGAATTCGGC ACGAGGAGGA GGCCGTCAGC 60 
TGGCAGGAGC GCAGGATGGC AGCTGYTCCC CCGGGTTGCA CCCCCCCAGY TCTGCTGGAC 120 
ATAAGYTGGT TAACAGAGAG CCTGGGAGCT GGGCAGCCTG TACCTGTGGA GTGCCGGCAC 180 
CGCCTGGAGG TGGCTGGGCC AAGGAAGGGG CCTCTGAGCC CAGCATGGAT GCCTGCCTAT 240 
GCCTGCCAGC GCCCTACGCC CCTCACACAC CACAACACTG C^CTMTCCGA GCTGCTGGAG 300 
CATGGAGTGT GTGAGGAGGT GGAGAGAGTT CGGCGCTCAG AGAGGTACCA GACCATGAAG 360 
GTGCGCAGGG CAGGGCTCGG ACCTACCCCA GGAATGTCCT GCCCTGGGAA TGACAACACA 420 
GTCCACACCA TGCACGGGGA GGCAAACAGG GGCAGCTGAC CCAGCCCAGG GGTCAGANGA 480 
GGTCTTGCCG AGGAAGTGGC AGCTAAGCTG ATACCTGATA TGCACWAGKC AGCCARGYGG 540 
AGACAGGCAA GGAAGAAGCT TGTTTTGAGG ACAGAATTTT CTAGATCACT CAGCACCATC 600 
TC<XTITTTGG GGCTTTTTGT TTTATTTTGT TTTTGAGACG GGGTCTCGCT CTGTCGCCCA 660 
N 661 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 553 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

GGCACAGGGC TATTTGCCCC TCTCTCCACA TGACAGAACT GCTCTAAGTT TCTTTGCTGC 60 
TCTTCTCAGC TGTCAGACGG CTTGCTGCTT GTTTTCCACA CCACCATGTC TATTCTTTGC 120 
TGTCCTTWAC TCTGCCTGTT TTTTTCCTTT TGTATTTCTT CTG3CTCTTG TCCCTTTTCC 180 
CACGTGTCWC AGCTTTCCTT TATTGCCACT TTCAGTCAGA GCA3TCCTGT GCTTCTGGTG 240 
CCGGCATACA ATACTTACTT GAGTTTCTTG GCTTTTCTTG ACTGTGCATC TCTTACTTCA 300 
ACATAGGAAT AGCCTGTCAT AGAATTTCTC CAGTTCCAGG GCTCAAGA(3G GAGAGTGCCA 360 
GAAAATTGAG ACTGTTTTCC CTGTCTTGGA TTGAATTCAT AAAGCAAAAC CAGTGTTTGT 420 
GTGAGGGTTT GCTGTGTCAT GCCTATAGGT TGTTTGGGTG CAAACCTATA GAATCCAGCC 480 
TGCGAAAAGA AAGRAACCAG AGAATANCAG CATCAGAACA ATGCTTGAOA TCATTTCTCA 540 
ATCAAGCAGT CCA 553 



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 869 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

GGCACGAGCT GCCAACACTG AGGTCTTCGT GGCTTCTCAC ATCTAGATGT ATCCCTCTCA 60 
AATCTATCCT CTATCCAGGC ACCAGATTGA GGTATCTAAA ATGTCAACTT TCCAGTTACT 120 
CCTTCTTATA CTAGCCCAAT CAACTTACAA GATAAAGTCC AAGCCCCTTC ATATGACAAA 180 
CCACACCCTG CTTAACTCTC CA3GTTTGAA TCCTTCATCT CCTACTTTAA ACTTTAAAAC 240 
CCAGCAGCAC GAAAGTGTCT CCTATGCATG TTGCCATATG CGTTCTCTCC ATCATGCATT 300 
TGCCTGAGCA AGATGTCTTG AGTTAACATC TTATTCTTTA AGACTCATTG TGGTGGTAGA 360 
CAGCCTTTAA TAACGGATCC TTGGCCAGGC ACAGTGACTC ACACCTGTAA TCCCAGAACT 420 
TTGAAAGGCC AAAGAAGGAA GAAAGCTTGA GGCCAGTAGT TTGAGACCAG CCTGGGAAAC 480 
AGAGAGATAT CCCATCTGTA CCAAAAATTT AAAAAAATAT TAGCAGGGAG TAGTGGCATG 540 
CACAAGTGCT CCCAGCTCCA TGGGAGASTG AGGTAGGAAC ATCACTTGAG CCCAGGAAGT 600 
CAAGGCTGCA GTGAACCATG ATCAGAACA? TGCANTCCAG CTTGGGTAAC AGAGTGAGAC 660 
CTTAGGTCAG AAAAATGAAT AAATAAGCAT AAAATTTTAA AAACTTAGCC AGGCATGGTG 720 
GCACACATCT GTGGTCCCTG CTACTTAGGA GGCTGAGGTG AGAGGATCCT TGAGCCCAGG 780 
AGGTCAACAC TACAGTGAGC TATGATTGTG CCACTAAACT CCAACCTGGG TGAAAAAGCA 840 
AAACCCTGCC AAAAAAAAAA AAAAAAAC7 869 



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 959 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

GGCGAGCCGA GATCGTGCCA TTGCACTCCA GCCTGGGCAA CAAGAGTGAA ACTCTGTCTC 60 
AAAAAAAAAA AATTATAATA CTATATGCCA TAAAATGACA TTTCATATTT AAAGAGTTTT 120 
TTAAAACTCT TGTATTCACA TGCCATAATT TGAAACCCTA TTTCACTGAA TGAGAATGGT 180 
ATCTGTTGTC CTGATTTTTT CATTTTTATC CTTAACAATT TCCACCACAG CCAGTGCATA 240 
TAATGGCAAT GACACCCAGG GATGGAATGA TAAGTTCCAT CRCMGCTCAG TCAAGACGCA 300 
GACTTGATGT GGCCCCAACA ACAGTCAATA ATGGAGTCTC CAAAATAAAG CTCTATAGGA 360 
AAGGTAAATA CCCGCTGCAC AAGAAACCAC AGCATCTAGG TTCTAACCCC ATCTCTATGA 420 
AGAGCTTGCT GGGAGAGTTT TGACATTWAA CAATCTGTCT GATKGCCAAT TTTYTTCTTC 480 
TATAAAATGA TAATGTTKGA YTCAAAGATC CAAAGTCAAT TCATGGTCTA AAACTTAATG 540 
ATTTTTTTAG GTTTTGKGAC ATTTCACTGT ACACTGTAGT AATTTATATC TTATTTTCCC 600 
ACTAATTTAG AAAAATATYT AAATGATCCT TAATTGGCAA TGGGTCCTAA GAATTTTGTT 660 
TTAAATCCCT GTTACCCAAA AGAGCCCTTT TTTGTATCTC GCAGTAGTTA CAAGGATCTT 720 
TCTAAATCTT AAAAAAAAAA AAAAAAGAAA GAAAGAAAAG AAAAGAAAAA AAGTCAGCCG 780 
GGCGTGGTGG CTCATGCCTG TAATCCCAGC ACTTTGGGAC CAAGGTGGAC AGATCACGAG 840 
GTCAGGAGAT GGAGACCATC CCGGCCAACA TGGAGAAACC CTGTCTCTAC TAAAAAAAAA 900 
AAAAACTCGA GGGGGGCCCG GTACCCAATN CGCCGGCTAG TGGTCGTAAA ACAATCAAA 959 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1446 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

CSGGGCAGGG CTGTGTGGCA CCGCCAGGGA GCGGGCCCAC CTGAGTCACT TTATTGGGTT 60 
CAGTCAACAC TTTCTTGCTC CCTGTTTTCT CTTCTGTGG3 ATGATCTCAG ATGCAGGGGC 120 
TGGTTTTG3G GTTTTCCTGC TTGTGCCAAG GGCTGGACA: TGCTGGGGGG CTGGAAAGCC 180 
CCTCCCTTCC TGTCCTTCTG TGGCCTCCAT CCCCTCATGG GTGCTGCCAT CCTTCCTGGA 240 
GAGAGGGAG3 TGAAAGCTGG TGTGAGCCCA GTGGGTTCCC GCCCACTCAC CCAGGAGCTG 300 
GCTGGGCCAG GACCGGGAGA GGGAGCACTG CTGCCCTCCT GGCCCTGCTC CTTCCGCAGT 360 
TAGGGGTGGA CCGAGCCTCG CTTTCCCCAC TGTTCTGGAG GGAAGGGGAA GGAGGGGGTC 420 
TTCAGGCTGG AGCCAGGCTG GGGGTGCTGG GTGGAGAGAT GAGATTTAGG GGGTGCCTCA 480 
TGGGGTGGGC AGGCCTGGGG TGAAATRAGA AAGGCCCAGA ACGTGCAGGT CTGCGGAGGG 540 
GAAGTGTCCT GAGTGAAGGA GGGGACCCCC ATCCTGGGGG ATGCTGGGAG TGAGTGAGTG 600 
AGATGGCTGA GTGAGGGTTA TGGGGAGCCT GAGGTTTTAT GGGCCTGTGT ATCCCCTTCT 660 
CCCGGCCCCA GCCTGCCTCC CTCCTGCCCG CCTGGCCCAC AGGTCTCCCT CTGGTCCCTG 720 
TCCCTCTGGT GGTTGGGGAT GGAGCGGCAG CAAGGGGTGT AATGGGGCTG GGTTCTGTCT 780 
TCTACAGGCC ACCCCGAGGT CCTCAGTGGT TGCCTGGGGA GCCGGACGGG GCTCCTGAGG 840 
GGTACAGGTT GGGTGGGCCC TCCCTGAGGG TCTGGGGTCA GGCTTTGGCT CTGCTGCCTC 900 
TCAGTCACCA AGTCACCTCC CTCTGAAAAT CCAGTCCCTT CTTTGGATGT CCTTGTGAGT 960 
CACTCTGGGC CTGGCTGTCG TCCCTCCTCA GCTTCTTGTT CCTGGGACAA GGGTCAAGCC 1020 
AGGATGGGCC CAGGCCTGGG ATCCCCCACC CCAGGACCCC CAGGCCCCCT CCCCTGCTGC 1080 
TTTGCGGGGG GCAGGGCAGA AATGGACTCC TTTTGGGTCC CCGAGGTGGG GTCCCCTCCC 1140 
AGCCCTGCAT CCTCCGTGCC STAGACCTGC TCCCCAGAGG AGGGGCCTTG ACCCACAGGA 1200 
CGTGTGGTGG CGCCTGGCAC TCAGGGACCC CCAGCTGCCC CAGCCCTGGT CTCTGGCGCA 1260 
TCTCTTCCCT CTTGTCCCGA AGATCTGCGC CTCTAGTGCC TTTTGAGGGG TTCCCATCAT 1320 
CCCTCCCTGA TATTGTATTG AAAATATTAT GCACACTGTT CATGCTTCTA CTAATCAATA 1380 
AACGCTTTAT TTAAAGCCAA AAAAAAAAAA AAAAAACTCG AGGGGGGGCC CGTACCCAAT 1440 
TCGCCA 1446 



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1471 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

CAAAAAATAA TAATGATAAT 7TAAAATAAA TAAGTAACTA ATAAAAAGAT TTTATATCCC 60 
AGTCTTATGA TGTTGGTTGG CAAGGCTAGA TAAAAAGATG TTAGAATGAA AGAACATATT 120 
TTTAGTGATA TGTAAATGAA GGATTCTACA ATAGTCATAT ATTTTTATAT GAATGAATGT 180 
TGGGTTGGGC TGGAGAGGTA TGTGTGTGTA AATATAAAGG TCTCACATTC AGAGTATAGC 240 
TCTGAAATAA TGGAACTCAT GTCTACAATT CAACATGCAT CTGTATAGTT ACATCTCATG 300 
TAAATATACA CAGACATATT TTGCAGCCAG TAATTGACAG TTAATGTCCA AAACAGGTGA 360 
TTGATAGGTA ACAGAAATTA GATAACCACC AATTTTGCCC AAGAGAAAGA CTAGAAGGAC 420 
TAAAAGCAGT TGAATGTATG GTACTGACAT TGTCATAAGC AGTCTGATAA CCAGTTTATT 480 
GAAACGTGTG CATTAACAGA GAATTTAATT TTAAACCCAT AATTTCTCCT ATCCATTAAA 540 
ATATTATAAT TGTTAGTAGT ATGAAACCAA CAGGAAATGT TTTTTAATCA TTTAGTGAGG 600 
TGATTCATTT GTTTCATGGG CAAACACTAT CCAGGAAAAG CCTTGCTTGC CTGTTTCCCA 660 
AAGAGCTCTA AGAAATAGAA TCAAGTGTAA AATGGTTCAG ACCATTCAGG ATTTCTTGTC 720 
ACTCTTCTCA ACCCCGATCT TCCTGTTATT ACTGATGTTT GAAACCCTGT CATTAGCCCC 780 
GGCCTGGTTA AAGCCCCTCA GAGTCACCTC TCATTCATAG CAATAGAATT CAACCCCAAG 840 
TGGTTGATGG TGTCCCCAGC ACAGCCGAGA GACCTGATCT CIGGATTCAG TGCTTTTAGC 900 
TCTTCGAGTT TACCCTAAGA TACCTTCGGG CAATATTTTT AACCAACCCA AAAGCTCTTC 960 
AGGTCATTTC TGAAGAGGAC AAGGTGAATC TTGGCTTGGA ACACCATTTT TGGGCTCTTG 1020 
CTACTGAATG AATCAGAAAG GAATTTTTTC TGAAGAGCAT TAGAAAGTAA AGGAGATGTT 1080 
AAAATAAGTT CTTGAAGTAT GTTTTATATT TATCTAAAAC ACTGATTTTA AAAGTTTACA 1140 
TTCAAATGTG TATTCAAAAG AAGTACTGAT TTGTAATTAT TATAGTTTGT GTGTATCATC 1200 
CCCTTTTAAC CGTGCCTAAC AACTGTACTT AAATTTTGTT TTCCTAGTGT AACAAATGTT 1260 
TCCCATAAGA TTTTCTAGAG CCAAATAATG GGAGTGAAAA ATTCCTTAAG TGTTATATAA 1320 
GAAAATATAT TAGAAAATCA GCTTTGGATT ATACGATTTC TAAAATATAC TAATACAGAA 1380 
TCCTCAGTAA TATGTTTTGA ATTC^ATTTT TTCTCAGAAC TGTTACATAA TAAATAATAC 1440 
ATCAACCAGA AAAAAAAAAA AAAAAAATTN C 1471 



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1402 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

AGGGACGTCT TGCCTGAGGA GATGCCCATT TCTGTCCTGG RTTACCCTCA CTGCGTGGTG 60 
CATGAGCTGC CAGAGCTGAC GGCGGAGAG? TTGGAAGCAG GTGACAGTAA CCAATTTTGC 120 
TGGAGGAACC TCTTTTCTTG TATCAATCTG CTTCGGATCT TGAACAAGCT GACAAAGTGG 180 
AAGCATTCAA GGACAATGAT GCTGGTGGTG TTCAAGTCAG CCCCCATCTT GAAGCGGGCC 240 
CTAAAGGTGA AACAAGCCAT GATGCAGCTC TATGTGCTGA AGCTGCTCAA GGTACAGACC 300 
AAATACTTGG GGCGGCAGTG GCGAAAGAGC AACATGAAGA CCATGTCTGC CATCTACCAG 360 
AAGGTGCGGC ATCGGCTGAA CGACGACTGG GCATACGGCA ATGATCTTGA TGCCCGGCCT 420 
TGGGACTTCC AGGCAGAGGA GTGTGCCCTT CGTGCCAACA TTGAACGCTT CAACGCCCGG 480 
CGCTATGACC GGGCCCACAG CAACCCTGAC TTCCTGCCAG TGGACAACTG CCTGCAGAGT 540 
GTCCTGGGCC AACGGGTGGA CCTCCCTGAG GACTTTCAGA TGAACTATGA CCTCTGGTTA 600 
GAAAGGGAGG TCTTCTCCAA GCCCATTTCC TGGGAAGAGC TGCTGCAGTG AGGCTGTTGG 660 
TTAGGGGACT GAAATGGAGA GAAAAGATGA TCTGAAGGTA CCTGTGGGAC TGTCCTAGTT 720 
CATTGCTGCA GTGCTCCCAT CCCCCACCAG GTGGCAGCAC AGCCCCACTG TGTCTTCCGC 780 
AGTCTGTCCT GGGCTTGGGT GAGCCCAGCT TGACCTCCCC TTGGTTCCCA GGGTCCTGCT 840 
CCGAAGCAGT CATCTCTGCC TGAGATCCAT TCTTCCTTTA MTTCCCCCAM CCTCCTCTCT 900 
TGGATATGGT TGGTTTTGGC TCATTTCACA ATCAGCCCAA GGYTGGGAAA GCTGGAATGG 960 
GATGGGAACC CCTCCGCCGT GCATCTRAAT TTCAGGGGTC ATGCTGATGC CTCTCGAGAC 1020 
ATACAAATCC TTGCCTTTGT CAGCTTGCAA AGGAGGAGAG TTTAGGATTA GGGCCAGGGC 1080 
CAGAAAGTCG GTATCTTGGT TGTGCTCTGG GGTGGGGGTG GGGTGTTTCT GATGTTATTC 1140 
CAGCCTCCTG CTACATTATA TCCAGAAGTA ATTGCGGAGG CTCCTTCAGC TGCCTCAGCA 1200 
CTTTGATTTT GGACAGGGAC AAGGTAGGAA GAGAAGCTTC CCTTAACCAG AGGGGCCATT 1260 
TTTCCTTTTG GCTTTCGAGG GCCTGTAAAT ATCTATATAT AATTCTGTGT GTATTCTGTG 1320 
TCATGTTGGG GTTTTTAATG TGATTGTGTA TTCTGTTTAC ATTAAAAAGA AGCAAAAATA 1380 
ATAAAAAAAA AAAAAAAAAA CT 1402 



(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1047 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

GGCACAGGGG ACTACAGGCA CCCACGACCA TACCCAGCTA ATTTTTGTAT TTTTTTGTAG 60
AGATGGGGTT TCACGATGTC GCCCAGGCTG GTCTTGAACT CCTGGGCTTG AGCGATCTTC 120
CCATCTTTCC ATCTTGGCCT CCTAAAGTGC TGGGACTGCA GGCATGAGCC ACCATGCCCA 180
GCCAAGATTC TTATTGATTA CCATGTTGCT TCAAGAAGCC AAGCCAGTTT CCAATATTCC 240
CCATTTGCTG GAGTCTTGGT ACTTTGGGTA GAAGCAACTG GTAAATTGTT AATTGGAACA 300
NTTGGTGGTG TAGATAACCA CGTATGGCCA AACCTAGAGC ATCTAGGCTC ACAATTACTA 360
TCCTGACTTG ATAACAAGTG TTCTGATATT AACCTGAAAA TGGGAATAAT GCCAAATCTG 420
TGTAACTTAA CATCTATATA CACAGTGGGG AGAACTGAAG TTATTAAACC TGGAATCTCT 480
GTGATCAAGG CTAACAGTAG TTATCTAAGA AGCAAAGGAC CTACAATTCT TAGACTTGGA 540
GTCATATTCT TTAAGGACGT GTTCTGAAAC TATATCAAGC ATCTGGTTTC CACGTATTTC 600
TCCCTCAGAA ATTATGAAGT ACAAGTAAAA ATGAAGGTAC AGGGTAAGAC ACATGCTGCT 660
TTCTTGCTCT TGAGTGGAGA CAGTTTTCCA GCCATCTTAA CCCCTTWACA CAAAACAATT 720
TGTGTTTTAT AGCAAATAAG TGACTCAACA TAATTTCAAT ATGATGTTTA TCCACCAGTA 780
CTTTCCTTTC AGCTTCTAGT CCCATAARTG GTTTGTGAAG TCATCGGTTA CATTAGCCAA 840
GATAGGCCTA GACTTGAAGT CTAGAATGTT TTTCCCACTA TATGCCAAAG TAGAATGTGG 900
GTATCTCAGG GTCATTTTTG TTGTTCAATT TCCCACCTGT ACAGTTGTTA TGATTCACTT 960
TCCTTATGTG TCTAATAAAT CTTGTTCCAT GAAATGATCA AAAAAAAAAA AAAAAAAACT 1020
CGAGGGGGGG CCCGGTACCC AAATCGC 1047

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 990 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

TTGGAAAGGG TCTAGCTCTT TCTCATTCAC CAACTATATT AGAAGCACTT GAGGGAAATT 60
TACCACTCCA AATCCAAAGC AATGAACAGT CTTTTCTGGA TGATITTATT GCCTGTGTCC 120
CAGGATCAAG TGGTGGAAGG CTTGCAAGGT GGCTTCAGCC AGATTCATAT GCGGATCCTC 180
AGAAAACATC TTTGATCCTG GAATAAGGAT GATATTCGTT GTGGTTGGCC TACCACCATA 240
ACTGTTGAAA CAAAAGACCA GTATGGGGAT GTGGTACATG TTCCCAATAT GAAGGTAATT 300
ATAACTGGAT TAAATTAGCA GACATCTATA TACTGGCTGC AATGACTGAT AAAATTTTAG 360
AAATGCCAAG TGCTGAGPGT CGATTTGTTC TACCCTCTTT ATATAAAGGG TGATGCTGAA 420
AGTTTGTTTA AATGACTTGT TTATATTAAT TAGTCCCCAA GTGTCCAAGT TACACCTGTT 480
TTTTTTGTGA GTTTGTTCTT TACATTTTGC TACCTGTTAC GGGGACTCAA AGGAGGGATA 540
AGAAAGTATC CATCTAAAGA GTGCTAGACA CATACAGTGA AGCCCCTCAA TATGTATTGA 600
TTGAATAAAT GCATGAAAGA ATACATTTTT AAATTTTGTG TATAGTTTTG AAAGACTGAA 660
GTACGTTCTG TGTTTGGTAT TACTGAAACC ACATTTTAAA AATAACACTC ATTAAGTTAG 720
AAATATATGA GTTTAGATTG TAAAAGAATG AGGAATTGAA ATAGTTGTAT ACCATATTGA 780
TGAATATAGA GTTTTTAGGA TACCTCTTAC CTGAAATATT AATAATAATG TTTNCAGAGC 840
ATATTATACA TAATTATTTG TGATTTAATC TGTTAATATG AATATCTCAT TTAAAACTTT 900
TATTTCTGAA AAAATTATAT TGAATAAAAT TTTATATAGG CAGTCCCCAG CCCTTTCCTC 960
CTTCAAAGTT GTCTTATAGA GTGATTGGTT 990



(2) INFORMATION FOR SEQ ID NO: 25: 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1208 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

TAATCGCTAC TATAGGGAAA GCTGGTCGCT GCAGGTACCG GTCCGGAATT CCGGGTCGAC 60
CCACGCGTCC GAGCGAAATG GCGCCTCCGG CCCCCGGCCC GGCCTCCGGC GGCTCCGGGG 120
AGGTAGACGA GCTGTTCGAC GTAAAGAACG CCTTCTACAT CGGCAGCTAC CAGGAGTGCA 180
TAAACGAGGC GCASGGGTGA AGCTRTCAAG CCCAGAGAGA GACGTGGAGA GGGACGTCTT 240
CCTGTATAGA GCGTACCTGG CGCAGAGGAA GTTCGGTGTG GTCCTGGATG AGATCAAGCC 300



(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1922 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

GTGCTGCGCT ACTGAGCAGC GCCATGGAGG ACTCTGAAGC ACTGGGCTTC GAACACATGG 
GCCTCGATCC CCGGCTCCTT CAGGCTGTCA CCGATCTGGG CTGGTCGCGA CCTACGCTGA 
TCCAGGAGAA GGCCATCCCA CTGGCCCTAG AAGGGAAGGA CCTCCTGGCT CGGGCCCGCA 
CGGGCTCCGG GAAGACGGCC GCTTATGCTA TTCCGATGCT GCAGCTGTTG CTC^ATAGGA 
55 AGGCGACAGG TCCGGTGGTA GAACAGGCAG TGAGAGGCCT TGTTCTTGTT CCTACCAAGG 
AGCTGGCACG GCAAGCACAG TCCATGATTC AGCAGCTGGC 7 AC 3TACTGT GCTCGGGATG 



CTCCIGGGCC ZCTGAGCTCC AGGCCGTGCG CATGTTTGCT GACTACCTCG CCCACGAGAG 360 

TCGGAGGGAC AGCATCGTGG CCGAGITGGA CCGAGAGATG AGCAGGAGCX 7'>GA?.GTGAC 420 

CAACACCACC TTCCTGCTCA TGGCCGCCTC CATCTATCTC CACGACCAGA ACCC-3GATGC 480

CGCCCTGC3T GCGCTGCACC AGOG 5 3 AC AG CCTGGAGTGC ACAGCCATGA CAGTGCAGAT 540 

10 CCTGCTGAAG CTGGACCGCC TGGACCTCGC CCGGAAGGAG CTGAAGAGAA TGCAGGACCT 600 

GGACGAGGAT GCCACCCTCA CCCA3CTCGC CACTGCCTGG GTCAGCCTGG CCAC3GGTGG 660 

TGAGAAGCTG CAGGATGC ZT ACTACATCTT CCAGGAGATG GCTGACAAGT GCTOGCCCAC 720 

15 

CCTGCTGCTG CTCAATGG3C AGGCGGCCTG CCACATGGCC CAGGGCCGCT GGGAGGCCGC 780 

TGAGGGCCTG CTGCAGGAGG CGCTAGACAA GGATAGTGGC TACCCRGAGA CGCTGGTCAA 840 

20 CCTCATCGTC CTGTCCCAGC ACCTKGGCAA GCCCCCTGAG GTGACAAACC GATACCTGTC 900 

CCAGCTGAAG GATGCCCAGA GGTCCCATCC CTTCATCAAG GAGTACCAGG CCAAGGAGAA 960 

CGACTTTGAC AGGCTGGTGC TACAGTACGC TCCCAGCGCT GAGGCTGGCC CAGAGCTGTC 1020
AGGACCATGA AGCCAGGACA GAGGCCAGGA GCCAGCCCTG CAGCCCTCCC CACCCGGCAT 1080
CCACCTGCAT CCCTCTGGGG CAGGAGCCCA CCCCCAGCAC CCCCATCTGT TAATAAATAT 1140
CTCAACTCCA RGGTGTTCCA CCTGAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1200
AAAAAAAA 1208
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120 
180 
240 
300
TCCGAGTGGC CAATGTCTCA GCTGCTGAAG ACTCAGTCTC TCAGAGAGCT GTGCTGATGG 420
AGAAGCCAGA TGTGGTAGTA GGGACCCCAT CTCGCATATT AAGCCACTTG CAGCAAGACA 480
GCCTGAAACT TCGTGACTCC CTGGAGCTTT TGGTGGTGGA CGAAGCTGAC CTTCTTTTTT 540
CCTTTGGCTT TGAAGAAGAG CTCAAGAGTC TCCTCTGTCA CTTGCCCCGG ATTTACCAGG 600
CTTTTCTCAT GTCAGCTACT TTTAACGAGG ACGTACAAGC ACTCAAGGAG CTGATATTAC 660
A7AACCCGGT TACCCTTAAG TTACAGGAGT CCCAGCTGCC TGGGCCAGAC CAGTTACAGC 720
AGTTTCAGGT GGTCTGTGAG ACTGAGGAAG ACAAATTCCT CCTGCTGTAT GCCCTGCTCA 780
AGCTGTCATT GATTCGGGGC AAGTCTCTGC TCTTTGTCAA CACrCTAGAA CGGAGTTACC 840
GGCTACGCCT GTTCTTGGAA CAGTTCAGCA TCCCCACCTG TGTGCTCAAT GGAGAGCTTC 900
CACTGCGCTC CAGGTGCCAC ATCATCTCAC AGTTCAACCA AGGCTTCTAC GACTGTGTCA 960
TAGCAACTGA TGCTGAAGTC CTGGGGGCCC CAGTCAAGGG CAAGCGTCGG GGCCGAGGGC 1020
CNAAAGGGGA CAAGGCCTCT GATCCGGAAG CAGGTGTGGC CCGGGGCATA GACTTCCACC 1080
ATGTGTCTGC TGTGCTCAAC TTTGATCTTC CCCCAACCCC TGAGGCCTAC ATCCATCGAG 1140
CTGGCAGGAC AGCACGCGCT AACAACCCAG GCATAGTCTT AACCTTTGTG CTTCCCACGG 1200
AGCAGTTCCA CTTAGGCAAG ATTGAGGAGC TTCTCAGTGG AGAGAACAGG GGCCCCATTC 1260
TGCTCCCCTA CCAGTTCCGG ATGGAGGAGA TCGAGGGCTT CCGCTATCGC TGCAGGGATG 1320
CCATGCGCTC AGTGACTAAG CAGGCCATTC GGGAGGCAAG ATTGAAGGAG ATCAAGGAAG 1380
AGCTTCTGCA TTCTGAGAAG CTTAAGACAT ACTTTGAAGA CAACCCTAGG GACCTCCAGC 1440
TGCTGCGGCA TGACCTACCT TTGCACCCCG CAGTGGTGAA GCCCCACCTG GGCCATGTTC 1500
CTGACTACCT GGTTCCTCCT GCTCTCCGTG GCCTGGTRCG CCCTCACAAG AAGCGGAAGA 1560
AGCTGTCTTC CTCTTGTAGG AAGGCCAAGA GAGCAAAGTC CCAGAACCCA CTGCGCAGCT 1620
TCAAGCACAA AGGAAAGAAA TTCAGACCCA CAGCCAAGCC CTCCTGAGGT TGTTGGGCCT 1680
CTCTGGAGCT GAGCACATTG TGGAGCACAG GCTTACACCC TTCGTGGACA GGCGAGGCTC 1740
TGGTGCTTAC TGCACAGCCT GAACAGACAG TTCTGGGGCC GGCAGTGCTG GGCCCTTTAG 1800
CTCCTTGGCA CTTCCAAGCT GGCATCTTGC CCCTTGACAA CAGAATAAAA ATTTTAGCTG 1860
CCCCAAAAAA AAAAAAAAAA AAAAAAACTC GAGGGGGGGC CCGTACCCAA TTCGCCCTAT 1920
1922
(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1951 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

TCGTCCCCAG AGCGGGCTGA GCCCCAGGCG SAGGGTGGCG GGGGAGCCTG GGGGAGCCGC 

CGCCACCTCC ACGGGCCTCT CTGAGCTO3G AC AG CAGCGC CCTGTCCTAT GACTCTGTCA 

AGTACACGCT GGTGGTAGAT GAGCATGCAC AGCTGGAGCT G3TGAGCCTG CGCCGTGCTT 

CGGAGACTAC AGTGACGAGA GTGACTCTGO CACCGTCTAT GACAACTGTG G "TCCGTCTC 

CTCGCCCTAT GAGTCGGCCA TCGGAGAGGA AT ATG AG 3 AG GCCCCGCGGC CZCASCCCCC 

TGCCTGCCTC TCCGAGGAAG TCCACGCCTG ATGAACCCGA CGTCCATTTC TIG AAG AAAT 

TCCTGAACGT YTTCATGAGT GGCCGCTCCC GCTCCTCCAG TGCTGAGTCC TTCG3GCTGT 

TCTCCTGCAT CATCAACGGG GAGGAGCAG3 AGCAGACCCA CGGGGCCATA TTCAGGTTTG

TGCCTCGACA CGAAGACGAA CTTGAGCTGG AAGTGGATGA GCCTCTGCTA GTGGAGCTCC 

AGGCTGAAGA CTACTGGTAC GAGGCCTACA ACATGCGCAC TGGTGCCCGG GGTGTCTTTC

CTGCCTATTA CGCCATCGAG GTGACCAAGG AGCCCGAGCA CATGGCAGCC CTGGCCAAAA 

ACAGTGACTG GGTGGACCAG TTCCGGGTGA AGTTCCTGGG CTCAGTCCAG GTTGCCTATC 

ACAAGGGCAA TGACGTCCTC TGTGCTGCTA TGCAAAAGAT TGCCACCACC CGCGGGCTCA 

CCGTGCACTT TAACCCGCCG TCCAGCTGTG TCCTGGAGAT CAGCGTGCGG GGTGTGAAGA 

TAGGCGTCAA GGCCGATGAC TCCCAGGAGG CCAAGGGGAA TAAATGTAGC CACTTTTTCC

AGTTAAAAAA CATCTCTTTC TGCGGATATC ATCCAAAGAA CAACAAGTAC TTTGGGTTCA 

TCACCAAGCA CCCCGCGGAC CACCGGTTTG CCTGCCACGT CTTTGTGTCT GAAGACTCCA 

CCAAAGCCCT GGCAGAGTGC GTGGGGAGAG CATTCCAGCA GTTCTACAAG CAGTTTGTGG 

AGTACACCTG CCCCACAGAA GATATCTACC TGGAGTAGCT GTGCAGCCCC GCCCTCTGCG 

TCCCCGAGCC CTCAGGCCAG TGCCAGGACA GCTGGCTGCT GACAGGATGT GGCACTGCTT 

GAGGAGGGGC ACCTGCCACC GCCAGAGGAC AAGGAAGTGG GGCGCTGGCC CAGGGTAGGG 

GAGGGTGGGG CAATGGGGAG AGGCAAATGC AGTTTATTGT AATATATGGG ATTAGATTCA 

TCTATGGAGG GCAGAGTGGG CTGCCTGGGG ATTGGGAGGG ACAGGGCTTG GGGAGCAGGT

CTCTGGCAGA GAAGGATGTC CGTTCCAGGA GCACACGGCC CTGCCCCATC CTGGGCCTTA

CCTCCCCTGC CAGGGCTCGG GCGCTGTGGC TCCTGCCTTG ATGAAGCCCG TGTCCTGCCT 

TGATGAAGCG TGTGCCACCT GCAAGTGCCC GCCC7GCCCC TGCCCCAAOC GCCACCGAAG 

AGCCCTGAGC TCAGGCTGAG CCCAGCCACC TCCCAAGGAC TT7CCAGTGA GGAAATGGCA 

ACACGTGGAG GTGAAGTCCC TGTTCTCAGC TCCGTCATCT GCGGGGCTTC TGGGTGGCTC
430
CTGCCACTGA CCTCACCGGC APGCTGGCCT GT3GCAGGCC TAGGACCTCA GGCGGGGAGG 1740

AGGAGCTGCC GCAAGGCCCT STCCCAGCAG AAGAGGGA03 CTTCCTGACT GACACAGGCC 1800

AGCCCCATCT TGGTCCTGTC ACCCTGGCCC CAACTATTAA AGTGCCATTT CCTGTGAAAA 1860

AAAAAAAAAA AAAATCGGGG GGGGCCCGGA ANCCAATTTC CCCCAAAAAG GGGGGTTATA 1920

AAAATTCCCN GGCNGTGTTT TTAAAAATTC G 1951



(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3989 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:



GGCACAGGCC GCAGGGNACC TATGGGCGCA TATAGGTTGT AATGAAACTG TAGTCTCAGT 60
TGGAAGCCTA GACATGAAAT GGGTCAGTGA GCAAGGCTCT ATTCCTAGTC TCCAGCCATG 120
CCTGTGGAAC CTGARCCCRC TCTCAGCACA TTGGACCCAG GCAGATGYAA AAAATTCACA 180
GAACTATGAT TTGGACTCAA GGGTTTGTAG ATTTCCTCCT TCATTCTAAT TTCAGTGTCT 240
AAAATTCTTG CATCCRTGAA CGAGCTGGGC ATTTGATGAG ACAGGGCYGA ATACTGCAGT 300
TTTCCTCCTA GAAATCATCT GGGGCATTTT CTTTGAACTG ATGGGAACAA TAAGGCATAA 360
CTGTTTGCAC AAACTTGGGA TAARTGATTT TGGGATAACG ATCTACCAGA ATGGGGATAT 420
TTCACCCTTG GTTCTGAGAT GCAAACCAAA GAATATCATG ACCAGCTTTC AGGCCTCCTG 480
AAGTATATCT CTCACATTGT CCTGTTCTCA TGCTGAGGAG CCTGAGATCC CTGTGTGGGG 540
ATTAGACAGT GGACTGTTAT GGGTGTAGGT GAATTGGCTT ATTTTGTCTG TCCCTGTCTG 600
AATGTATTGC AGGAAYTAAA AAGGACCAAG AAGAGGAAGA AGACCAAGGC CCACCATGCC 660
CCAGGCTCAG CAGGGAGCTG CTGGAGGTAG TAGAGCCTGA AGTCTTGCAG GACTCACTGG 720
ATAGATGTTA TTCAACTCCT TCCAGTTGTC TTGAACAGCC TGACTCCTGC CAGCCCTATG 780
GAAGTTCCTT TTATGCATTG GAGGAAAAAC ATGTTGGCTT TTCTCTTGAC GTGGGAGAAA 840
TTGAAAAGAA GGGGAAGGGG AAGAAAAGAA GGGGAAGAAG ATCAAAGAAG GAAAGAAGAA 900
GGGGAAGAAA AGAAGGGGAA GAAGATCAAA ACCCACCATG CCCCAGGCTC AGCAGGGAGC 960
TGCTGGATGA GAAAGRGCCT GAAGTCTTGC AGGACTCACT GGATAGATGT TATTCAACTC 1020
CTTCAGTTGT GTTGAACTGT GTGACTCATG CCAGCCCTAC AGAAGTGCCT TTTATGTATT 1080
CAGACATAGG ATGGGTCAGT GGGCATGGCT CTATTCCTAT TCTCAAACCA TGCCAGTGO:



TGCAGCACAT GCCGGGAGTG ATCAGTCRGA CATTTTAATT TGAACCACGT ATCTCTGGGT 
55 AGCTACAAAA TTCCTCAGGG ATTTCATTTT GCAGGCATGT CTCTGAGCTT CTATACCTGC 
TCAAGGTCAK TGTCATCTTT GTGTTTAGCT CATCCAAAGG TGTTACCCTG GTTTCAATGA 






GGAGCAACAG CATOTTGGCT TGGCTGTTGA CATGGATGAA ATOGAAAA GT ACC AAGAAGT 

■GGAAGAAGAC CAAGAGCCA? CA7GCCCCA3 GCTCAGCAGG GAGCTGCTGG AT G AG AAAGA 1200 

GCCTGAAGTC TTGCAGGACT CACTGGATAG ATGTTATTCG ACTCCTTCAG GTTATCTTGA 1260 

AGTGCCTGAC TTAGGCCAGC CCTACAGCAG TGCKGTTTAC TCATTGGAGG AMCAXTACCT 1320

TGGCTCKKCT CTTGACGTGG ASAAATTGAA AAGAAGGGGA AGGGGAARAA AAGAAGGGGA 1380

AGAAGATCAA AGAAGGAAAG AAGAAGGGGA AGAAAAGAAG GGGAAGAAGA TCAAAACCCA 1440

CCATGCCCCA GGCTCAGCAG GGAGCTGCTG GATGAGAAAG GGCCTGAAGT CTTGCAGGAC 1500

TCACTGGATA GATGTTATTC AACTCCTTCA GGTTGTCTTG AACTGACTGA CTCATGCCAG 1560

CCCTACAGAA GTGCCTTTTA YKTATTGGAG CAACAGYGTG TTGGCTTGGC TGTTGACATG 1620

GATGAAATTG AAAAGTACCA AGAAGTGGAA GAAGACCAAG ACCCATCATG CCCCAGGCTC 1680

AGCAGGGAGC TGCTGGATGA GAAAGAGCCT GAAGTCTTGC AGGACTCACT GGATAGATGT 1740

TATTCGACTC CTTCAGGTTA TCTTGAACTG CCTGACTTAG GCCAGCCCTA CAGCAGTGCT 1800

GTTTACTCAT TGGAGGAACA GTACCTTGGC TTGGCTCTTG ACGTGGACAG AATTAAAAAG 1860

GACCAAGAAG AGGAAGAAGA CCAAGGCCCA CCATGCCCCA GGCTCAGCAG GGAGCTGCTG 1920 

GAGGTAGTAG AGCCTGAAGT CTTGCAGGAC TCACTGGATA GATGTTATTC AACTCCTTCC 1980 

30 

AGTTGTCTTG AACAGCCTGA CTCCTGCCAG CCCTATGGAA GTTCCTTTTA TGCATTGGAG 2040 

GAAAAACATG TTXGGCTTTTC TCTTGACGTG GGAGAAATTG AAAAGAAGGG GAAGGGGAAG 2100 

35 AAAAGAAGGG GAAGAAGATC AAMGAAGRAA AGAAGAAGGG GAAGAAAAGA AGGGGAAGAA 2160 

GATCAAAACC CACCATGCCC CAGGCTCAAC GGCGTGCTGA TGGAAGTGGA AGAGCSTGAA 2220 

GTCTTACAGG ACTCACTGGA TAGATGTTAT TCGACTCCGT CAATGTACTT TGAACTACCT 2280 

GACTCATTCC AGCACTACAG AAGTGTGTTT TACTCATTTG AGGAACAGCA CATCAGCTTC 2340 

GCCCTTTACG TGGACAATAG GTTTTTTACT TTGACGGTGA CAAGTCTCCA CCTGGTGTTC 2400 

45 CAGATGGGAG TCATATTCCC ACAATAAGCA GCCCTTASTA AKCCGAGAGA TGTCATTCCT 2460 

GCAGGCAGGA C CI AT AGO" A MGTGAAGATT TGAATGAAAG TACAGTTCCA TTTGGAAGCC 2520 



2580 



AACCTGTGCT CAGTCTGAAG ACAATGGACC CACGTTAGGT GTGACACGTT CACATAACTG 2640 



2700 
2760 
2820 



60 



ACCTAACCTC ATTCTTTGTG TCTTCAGTGT TGGCTTGTTT TAGCTGATCC ATCTGTAACA 2880
:A3GAGGGAT CCTTGGCTGA GGATTGTATT TCAGAACCAC CAACTGCTCT 7GACAATTGT 2940

TAACCCGCTA GRCTCCTTTG GTTAGAGAAG CCACAGTCCT TCAGCCTCCA ATTGGTGTCA 3000

GTACTTAGGA AGACCACAGC TAGATGGAGA AACAGCATTG GGAGGCCTTA GCGGTGCTGG 3060

TCTCRATTCC ATCCTGTAGA GAACAGGAGT CAGGAGCCGC TGGCA-^GAGA CAGCATGTCA 3120

CCCAGGACTC TGCCGGTGCA GAATATGAAC AAYGCCATGT TCTTGCAGAA AA:GCTTAGC 3180

CTGAGTTTCA TAGGAGGTAA TCACCAGACA ACTGCAGAAT GTRGARCACT GA3CAGGACA 3240

GCTGACCTGT CTCCTTCACA TAGTCCATRT CACCACAAAT CACACAACAA AAAGGAGARG 3300

AGATATTTTG GGTTCAAAAA AAGTAAAAAG ATAATGTAGC T0:ATTTCTT TAGTTATTTT 3360

GARCCCCAAA TATTTCCTCA TCTTTTTGTT GTTGTCATKG ATGGTGGTGA CATGGACTTG 3420 

TTTATAGAGG ACAGGTCAGC TGTCTGGCTC AGTGATCTAC ATTCTGAAGT TGTCTGAAAA 3480 

TGTCTTCATG ATTAAATTCA GCCTAAACGT TTTGCCGGGA ACACTGCAGA GACAATGCTG 3540

TGAGTTTCCA ACCTYAGCCC ATCTGCGGGC AGAGAAGGTC TAGTTTGTCC ATCASCATTA 3600

TCATGATATC AGGACTGGTT ACTTGGTTAA GGAGGGGTCT AGGAGATCTG TCCCTTTTAG 3660

AGACACCTTA CTTATAATGA AGTATTTGGG AGGGTGGTTT TCAAAATTAG AAAT3TCCTG 3720

TATTCCRATG ATCATCCTGT AAACATTTTA TCATTTATTA ATCATCCCTG CCTGTGTCTA 3780

TTATTATATT CATATCTCTA CGCTGGAAAC TTTCTGCCTC AATGTTTACT GTGCCTTTGT 3840

TTTTGCTAGT GTGTGTTGTT GAAAAAAAAA ACATTCTCTG CCTGAGTTTT AATTTTTGTC 3900

CAAAGTTATT TTAATCTATA CAATTAAAAG CTTTTGCCTA TCAAAAAAAA AAAAAAAAAA 3960

AAAAAAAAAA AAAAAGCGGA CGCGTGGGC 3989




(2) INFORMATION FOR SEQ ID NO: 29: 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3735 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

CTGCTGTTCG (CT03CTGGGC TCCGCAGCAG GCTTGGCCAG C3GCTGACGG GTCGGCGGGC 60 

GGGTTTGTGT GAACAGGCAC GCAGCTGCAG ATTTTATTCT GGTAGTGCAN CCCTCTCAAA 120 

GGTTGAAGGA ACTGATGTAA CAGGGATTGA AGAAGTAGTA ATTCCAAAAA AGAAAACTTG 180
GGATAAAGTA GCCGTTCTTC AGGCACTTGC ATCCACAGTA AACAGGGATA CCACAGCTGT 240
GCCTTATGTG TTTCAAGATG ATCCTTACCT TATGCCAGCA TCATCTTTGG AATCTCGTTC 300
ATTTTTACTG 3CAAAGAAAT CCGGGGAGAA TXjTGGCCAAG TTTATTATTA ATTCATACCC 360
CAAATATTTT CAGAAGGACA TAGCTGAACC TCATATACCG TGTTTAATGC CTGAGTACTT 420
TGAACCTCAG ATCAAAGACA TAAGTGAAGC CGCCCTGAAG GAACGAATTG AGCTCAGAAA 480
AGTCAAAGCC TCTGTGGACA TGTTTGATCA GCTTTTGCAA GCAGGAACCA CTGTGTCTCT 540
TGAAACAACA AATAGTCTCT TGGATTTWTT GTGTTACTAT GGTGACCAGG AGCCCTCAAC 600
TGATTACCAT TTTCAACAAA CTGGACAGTC AGAAGCATTG GAA3AGGAAA ATGATGAGAC 660
ATCTAGGAGG AAAGCTGGTC ATCAGTTTGG AGTTACATGG CGAGCAAAAA ACAACGCTGA 720
GAGAATCTTT TCTCTAATGC CAGAGAAAAA TGAACATTCC TATTGCACAA TGATCCGAGG 780
AATGGTGAAG CACCGAGCTT ATGAGCAGGC ATTAAACTTG TACACTGAGT TACTAAACAA 840
CAGACTCCAT GCTGATGTAT ACACATTTAA TGCATTGATT GAAGCAACAG TATGTGCGAT 900
AAATGAGAAA TTTGAGGAAA AATGGAGTAA AATACTGGAG CTGCTAAGAC ACATGGTTGC 960
ACAGAAGGTG AAACCAAATC TTCAGACTTT TAATACCATT CTGAAATGTC TCCGAAGATT 1020
TCATGTGTTT GCAAGATCGC CAGCCTTACA GGTTTTACGT GAAATGAAAG CCATTGGAAT 1080
AGAACCCTCG CTTGCAACAT ATCACCATAT TATTCGCCTG TTTGATCAAC CTGGAGACCC 1140
TTTAAAGAGA TCATCCTTCA TCATTTATGA TATAATGAAT GAATTAATGG GAAAGAGATT 1200
TTCTCCAAAG GACCCGGATG ATGATAAGTT TTTTCAGTCA GCCATGAGCA TATGCTCATC 1260
TCTCAGAGAT CTAGAACTTG CCTACCAAGT ACATGGCCTT TTAAAAACCG GAGACAACTG 1320
GAAATTCATT GGACCTGATC AACATCGTAA TTTCTATTAT TCCAAGTTCT TCGATTTGAT 1380
TTGTCTAATG GAACAAATTG ATGTTACCTT GAAGTGGTAT GAGGACCTGA TACCTTCAGC 1440
CTACTTTCCC CACTCCCAAA CAATGATACA TCTTCTCCAA GCATTGGATG TGGCCAATCG 1500
GCTAGAAGTG ATTCCTAAAA TTTGGAAAGA TAGTAAAGAA TATGGTCATA CTTTCCGCAG 1560
TGACCTGAGA GAAGAGATCC TGATGCTCAT GGCAAGGGAC AAGCACCCAC CAGAGCTTCA 1620
GGTGGCATTT GCTGACTGTG CTGCTGATAT CAAATCTGCG TATGAAAGCC AACCCATCAG 1680
ACAGACTGCT CAGGATTGGC CAGCCACCTC TCTCAACTGT ATAGCTATCC TCTTTTTAAG 1740
GGCTGGGAGA ACTCAGGAAG CCTGGAAAAT GTTGGGGCTT TTCAGGAAGC ATAATAAGAT 1800
TCCTAGAAGT GAGTTGCTGA ATGAGCTTAT GGACAGTGCA AAAGTGTCTA ACAGCCCTTC 1860
CCAGGCCATT GAAGTAGTAG AGCTGGCAAG TGCCTTCAGC TTACCTATTT GTGAGGGCCT 1920
CACCCAGAGA GTAATGAGTG ATTTTGCAAT CAACCAGGAA CAAAAGGAAG CCCTAAGTAA 1980
TCTAACTGCA TTGACCAGTG ACAGTGATAC TGACAGCAGC AGTGACAGCG ACAGTGACAC 2040
CAGTGAAGGC AAATGAAAGT GGAGATTCAG GAGCAGCAAT GGTCTCACCA TAGCTGCTGG 2100



AATCACACCT GAGAACTGAG ATATACCAAT ATTTAACATT GTT AC AAA.3 A AGAAAAGATA 2160

CAGATTTGGT GAATTTGTTA CTGTGAGGTA CAGTCAGTAC ACAGCTGACT TATGTAGATT 2220

TAAGCTGCTA ATATGCTACT TAACCATCTA T7AATGCACC ATTAAAGGCT TAGCATTTAA 2280

GTAGCAACAT TGCGGTTTTC AGACACATGG TGAGGTCCAT GOGTCTTGTC ATCAGGATAA 2340

GCCTGCACAC CTAGAGTGTC GGTGAGCTGA CCTCACGATG CTGTCCTCGT GCGATTGCCC 2400

TCTCCTGCTG CTGGACTTCT GCCTTTGTTG GCCTGATGTG CTGCTGTGAT GCTGGTCCTT 2460

CATCTTAGGT GTTCATGCAG TTCTAACACA GTTGGGGTTG GGTCAATAGT TTCCCAATTT 2520

CAGGATATTT CGATGTCAGA AATAACGCAT CTTAGGAATG ACTAAACAAG ATAATGGCAG 2580

TTTAGGCTGC ACAACTGGTA AAATGACTGT AGATAAATGT TGTAATTAGT GTACACGTTT 2640

20 GTATTTTTGT TAATATAGCC GCTGCCATAG TTTTCTAACT TGAACAGCCA TGAATGTTTC 2700 

ATGTCTCCCT TTTTTTTTTG TCTATAGCTG TTACCTATTT TAGTGGTTGA AATGAGAGCT 2760 

AGTGATGACA GAAGGATGTG GAATGTCTTC TTGACATCAT TGTGTATTGC TGGTAATCAA 2820

GTTGGTAACG ACTACTTCTA GCAGCTCTTA CCACTATGAC TTAAGTGGTC CTGGAAGGCA 2880

GTAAGTGGAG GTTTGCAGCA TTCCTGCCTT CATGAGGGCT TCTACCACTG ACCACTTTGC 2940

ACGTACCTGG CTCCCAGATT TACTTAGGTA CCCCACGAGT CGTCCACATA AGCAGCTTCA 3000

TCTTTACCTT GCCAGAGTTG ACAATTATGG GATACTCTAG TCTACTTATA CTTGTGTTCC 3060 

CATCTGTCTG CCATCCTCTG AAGGCCAGGA CCCAGTCATA CATCCTTAGA AACCAAAGTA 3120 

TGGTTTTTGT TTTCTCTTGG AATGTCAGGT CTTAAGGCAT TTAATTGAGG GACAAAAAAA 3180

AAAAAAAGCC GATATAGTAG CTAGCTACTT AAGCATCCAT GGGTATTGCT CCATATCAAA 3240 

40 GCAGATTTGC AGGACAGAAA GAGTAAATTA GCCTTCAGTC TTGGTTTACA GCTTCCAAGG 3300 

AGAGCCTTGG CCACCTGAAA TGTTAACTCG GTCCCTTCCT GTCTCTAGTT CATCAGCACC 3360 

TGCAGATGCC TGACTCTTGT TAGCCTTACT ATTCAATACA GTCCTTAGAT TCACGGTATG 3420 

CCTCTTCCTA TCCAGGCACC TATTCTGAAT CACCATGTTG CTCTGCAGCT AGAGTTGATA 3480
GGAGAAAATC CATTTGGGTA GATGGCCTAT GAATTTGTAG TAGACTTTCA AAATGAGTGA 3540
TTTGTTAGCT TGGTACTTTT AAGTTTGTGG TACAGATCCT CCAAACCCAT ACTCTGAGCA 3600
ATTAACTGCC TTGAACATAG AGAAAATTAA GGCCTCACAG GATGAGTCTC CAT7CTCTGT 3660
AAATGCTTAT TTTATCATAG TCTTTAGCCN CTACTATGAG TAAAATGTTC TCTTCNGCCG 3720
GGTGTGGTGA CTCAC 3735




(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1667 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

TAGTAATTCA TTTAACTCCT CTTACATGAG TAGCGACAAT GAGTCAGATA TCGAAGATGA 60 

AGACTTAAAG TTAGAGCTGC GACGACTACG AGATAAACAT CTCAAAGAGA TTCAGGACCT 120

GCAGAGTCGC CAGAAGCATG AAATTGAATC TTTGTATACC AAACTGGGCA AGGTGCCCCC 180

TGCTGTTATT ATTCCCCCAG CTGCTCCCCT TTCAGGGAGA AGACGACGAC CCACTAAAAG 240

CAAAGGCAGC AAATCTAGTC GAAGCAGTTC CTTGGGGAAT AAAAGCCCCC AGCTTTCAGG 300

TAACCTGTCT GGTCAGAGTG CAGCTTCAGT CTTGCACCCC CAGCAGACCC TCCACCCTCC 360 

TGGCAACATC CCAGAGTCCG GGCAGAATCA GCTGTTACAG CCCCTTAAGC CATCTCCCTC 420 

25 CAGTGACAAC CTCTATTCAG CCTTCACCAG TGATGGTGCC ATTTCAGTAC CAAGCCTTTC 480 

TGCTCCAGGT CAAGGAACCA GCAGCACAAA CACTGTTGGG GCAACAGTGA ACAGCCAAGC 540 

CGCCCAAGCT CAGCCTCCTG CCATGACGTC CAGCAGGAAG GGCACATTCA CAGATGACTT 600

GCACAAGTTG GTAGACAATT GGGCCCGAGA TGCCATGAAT CTCTCAGGCA GGAGAGGAAG 660 

CAAAGGGCAC ATGAATTATG AGGGCCCTGG AATGGCAAGG AAGTTCTCTG CACCTGGGCA 720 
ACTGTGCATC TCCATGACCT CGAACCTGGG TGGCTCTGCC CCCATCTCTG CAGCATCAGC 780
TACCTCTCTA GGTCACTTCA CCAAGTCTAT GTGCCCCCCA CAGCAGTATG GCTTTCCAGC 840
TACCCCATTT GGCGCTCAAT GGAGTGGGAC GGGTGGCCCA GCACCACAGC CACTTGGCCA 900
GTTCCAACCT GTGGGAACTG CCTCCTTGCA GAATTTCAAC ATCAGCAATT TGCAGAAATC 960
CATCAGCAAC CCCCCAGGCT CCAACCTGCG GACCACTTAG ACCTAGAGAC ATTAACTGAA 1020
TAGATCTGGG GGCAGGAGAT GGAATGCTGA GGGGGTGGGT GGGGGTGGGA AGTAGCCTAT 1080



ATACTAACTA CTAGTGCTGC ATTTAACTGG TTATTTCTTG CCAGAGGGGA ATGTTTTTAA 1140 

TACTGCATTG AGCCCTCAGA ATGGAGAGTC TCCCCCGCTC CAGTTATTGG AATGGGAGAG 1200 

GAAGGAAAGA ACAGCTTTTT TGTCAAGGGG CAGCTTCAGA CCATGCTTTC CTGTTTATCT 1260 

ATACTCAGTA ATGAGGATGA GGGCTAGGAA AGTCTTGTTC ATAAGGAAGC TGGAGAACTC 1320 

55 AATGTAAAAT CAAACCCATC TGTAATTTCG AGTGGGTGGA GCTCTTGCTT TTGGTACATG 1380 

CCCTGAATCC CTCACTCCCT CAAGAATCCG AACCACAGGA CAAAAACCAC CTACTGGGCT 1440 

CTCTCCTACC CTGCCCTCCT CCCTTTTTTT TACCCCTCTC TTTTTTATTT TTTCTTTGCT 1500 

WO 98/54963 PCT/US98/11422

CTTTACAACC CAGTGAAAAA TACCAGGGTA CTGGGGTGCA ACTCTTTCTT ATGATAGGTC 1560
ATTAGTGCTT TAA3CAAAA3 AT ATT AO JAG CTTTGACTGC AGCATTAGCA ATTAGGRAAA 1620
AAAAAAANWA AAAACTCGAG GGjGGGCCCG GTTACCCAAT TCGCCCT 1667



(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1408 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

ATTACACACC TGAGCACTGT GCCTGGCAAG ACCTGTCTTA ATAGATTAGA GAACCACTGA 60
TAGATGGTCA GCTTTCTGTA GCAGTGAGAA CCCTACATTT CAAATGTGGA TAGCACCTTT 120
GCGGGGAAAC ATCACTTGGC ACATCTGCAT TCTTTTTTGA CACAGGGTCT CACTCIX7TTG 180
CCCAGGCTAG AGTGCATGGC ACGATCTTAG CTCACTGCAA CCTCCACCTC CCAAGTTCAA 240
GCGATTCTTC TGCCTCAGCC TCCTGAGCAG CTGGGATCAC AGACATGCGC TACCATGCCC 300
AGCTAATTTT TTGTATTTTT TGTKTGTTTG TTTTTGTTTK TAAGTAGAGA CGGGCTTTCA 360

CCACGTTGGS CAGGCAGGTC TCGAACTCCT GAMCTCAGGT GATCCACCCA CATCTGCGTT 420 

CCAATATCTT TCTCAACATA ATGATAGCCG TAATTAATAT TTTCCAGTAC ATTTTTATGC 480 

CTTTACACAC GAGAGTGGTA GACAGACACA AACCCAGATC TGTCTGACTC CAAAGCCCGT 540 

TTGTCATCAT TCCTTTTACG GTATCCTATA GTGGTATCCT TTACAGAAAG ACAGCTTTTA 600 

40 CCCAACAAAG ACTTAACTTC CCAGGATGCC AGAAGGACAA AGCGGGATTG CTTTTAAGRA 660 

GRAAGTTATC AAGAMCTTAT TTTATAAATG AGATTAGATA GGGAAAGGCA ATTTATCTTT 720 

ATTAAAAACT GAAAAGGCCA GCATAGGGAA GGAGGTCCTT CGGTGGTCTT TTTCAGGGAA 780

ATACTTCAGT TCCTTTTATT AGAAACAGAT AGTACCTAAG GTTTTGAGGT AGGWACAGCT 840 

TAAGGCATGC TAATGKTCAT GGGTCCTTCC ATAGTCATTT TKGTATTTTG GTTWACATTT 900 

50 GAGCAATAGG CAGCCCTTCA CTCCTCCTGG AYTCATTCCT GCCAYTATTA CAGGTGACAG 960 

AGGAGACAGG AGGTATGTCT TTTCTATTTT TAWACATGCT TTATATTTAA CACAAGCTCT 1020 

TGGGTATCTT AGATAAACAG AAGTTGCCTA GCACTCCTTT TAGTGCATTG AACCCTTTAA 1080

CATTTAAGCA AAATAATAAA CAGTCTTTTG AGGTTCCTTA ACAATGAAAC GTCTTCGAGT 1140 

GGCAGCAGCG GAATCCATGC YTCTTCTCCT GGAGTGTGCA AKAGTCCGTG GTCCTGAGTA 1200 

TCTCACACAG ATGTGGCATT TTATGTGTGA TGCTCTAATT AAGGCCATTG GTACAGAACC 1260




XTC-.G AAA.TAATOCA TTCTTTTGCA AAGCTGAATA TTTTTCTCTT 1320

-~<TGG TA7GTTCATT TATTAGTCTT GCTAAAAAAA AAAAAAAAAA 1330 






1408

(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2031 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:
AGGATATGCA TGATTCTTAA CCAGGCTATA TGTTAAAAAA AAATTGGAAA ATGCAATACA 60
'•'I Ti ' lT A-CTA TACAAACTAC AGAATGAGTA TGCAAGTTTT ATTTATCAAA ATGTAATGGA 120
TTTTTAAAGG CTGAGAAATT TTCCTTATAC CTACCTTTTC AGTTATTTTA ATTATACCAA 180
ATTATCAACT AGAATAGCTT CATCCATATG AAATATAAAA TGAAGAGACA CCTAGGCTCT 240
ATCAGGCTTA C^-TTCTTTG AACTTATTTC CACTTTAATT TCTCAGTGGA AGTTAAGAGG 300
GGTGAGA-AAA CAAAGAAGGG GAAAAACTGA CAACTAACAA AACCAGCACC ACATCGCTAG 360
3TGGTGC7TA CTAATTACCT TCTCAGGATT TTCCTCAGAT TGAAAAGCTT ATGAGGATTT 420
CTTGGGA.3TC TTAATAACCT GCCTGTTAGT ACAGAGCTTT CCTGATGATA TTTACTCTTG 480
AGCACATGTG GTTGTAAAAC CTTAACTTTC TTTCTCCAGG AGGGTGGTGA TAGAAACAGA 540
rGGTAGTA.1T TATGAACTGA TGTTCTCGTG AAATGTTGAG GGTGGGGAGA AAAGACTTTA 600
AGGGAGGAGA GCCATCTATT TTGTTCCTAA AGCCACCTCT CAGCAGAATC GTCATGTTTT 660
TCTGATGCAC CGCTCTC-CTT CATGCCCAAG ATGACTTGCG AGGCAATCTC AGGAGCTGTG 720
GACTTAACCR TTGCAAAGCA CACTGTCTTT CTCAGCGTTC TCTGCAAGTC AGTAGGTGTT 780
AGTATGGTTG CAAAGTTCAC TGTCTCAGCA AAGTTGAACT GGGCTACCTG TCTACAGCTG 840
TTTCCTCAGA GGGAAAAATC TTGAGACCAG ATGGTGGAGC TCTGGAGTCA GAGGAAATGG 900
GTGTCTTCAG CACAAAJGCTG CTGCTTTTAC TTCAGCCACT TCTGACATTT TTACATACCG 960
AGCCTGAGAT TRTGTGATTA TCTCAAATCA AATCACTTTG ATGGAGATAA ATAATCAAAA 1020
CTGTTTTATA GTCATTGATT TGGTGAGAAC AGTAATGGAA AATGGTGTTG AAGGACTTCT 1080
CATTTTTGGA GCTTTCCTTC GAGAGTCCTG GCTGATTGGT GTTCGCTGTT CATCTGAGCG 1140
C^AAAAGCA TTATTACTGA TACTTGCACA CAGTCAAAAG CGCAGACTGG ATGGATGGTG 1200
TTTTATAAGG CATTTAAGGG TACACTACTG TGTTTCACTG ACCATACATT TTTCTTAGCC 1260



CCTCAAGTAA TATAGCACAG AGTTATGAAT GACAATTCCC CTAACCATTv. CTCTT'GATAT 1320 

CTGCCTCTTC CCCTTACCAT CGTAATTCTC CAAACTGGTC ATAAAGGCAC TCTGTGAAGA 1380

TATTGGGGAC TGACATCTTA AGCTCTCACC TGGCTGCAGT AGGAAAGGCC AAACTGACGA 1440

CAAAAAAAAA ATTCTTTATA AAGATGATAT GGTAACATGT ATCTTTGCCC TGGGTCTGGG 1500

TGGGTCCAGT CAGTCTCAGA TTTACAAGCA TTTAGGAGCC TAGGTAAAAG GTGCTAGTAT 1560

TGTTTTAAAA GTTACATTTA TGACTTGCAA TGATAGAAAA CTCCTTCCAA TTAAATGGCA 1620

TTTTATAATA TTATGTGTGT ACTTCACAGT GTTAAAAATA CCCTCATACG TTATTGCATT 1680

TGATCTTCAC AGAAAGTGCA TTTTAACCAG TACTCTGGGT GCAATAAATA ATATGTAGAA 1740

ATTTAAGTCC TCCAATTCCA GCATATCCAG TGAGTTTTGA CAGTGTGTTT ATGTGGAATG 1800

TTTAAGGATA TACAATTGTA CTTTATATAA ATTGGTTCTT GTTCTTCTTA AATGTGACAT 1860

GAAATAATTG TGCTGCTACA TTATACTGGA AATTAACAG3 GGAAAAGGGA AGAGCTCTTG 1920

GCTCCCTTGA GGTTCTGCTA GTGGTGTTAG GAGTGGTTAC AACTGAGCTT TTAGTAACCA 1980

TTTAACCGTA TGTAAACTTG GTTTCTAATT AAAAAAAAAT TTCTTTTTCC A 2031




(2) INFORMATION FOR SEQ ID NO: 33: 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 971 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

CGCGTCGGAA CTCGGCCGCG GGACATCCAC GGGGCGCGAG TGACACGCGG GAGGGAGAGC 60
AGTGTTCTGC TGGAGCCGAT GCCAAAAACC ATGCATTTCT TATTCAGATT CATTGTTTTC 120
TTTTATCTGT GGGGCCTTTT TACTGCTCAG AGACAAAAGA AAGAGGAGAG CACCGAAGAA 180
GTGAAAATAG AAGTTTTGCA TCGTCCAGAA AACTGCTCTA AGACAAGCAA GAAGGGAGAC 240
CTACTAAATG CCCATTATGA CGGCTACCTG GCTAAAGACG GCTCGAAATT CTACTGCAGC 300
CGGACACAAA ATGAAGGCCA CCCCAAATGG TTTGTTCTTG GTGTTGGGCA AGTCATAAAA 360
GGCCTAGACA TTGCTATGAC AGATATGTGC CCTGGAGAAA AGCGAAAAGT AGTTATACCC 420
CCTTCATTTG CATACGGAAA GGAAGGCTAT GCAGAAGGCA AGATTCCACC GGATGCTACA 480
TTGATTTTTG AGATTGAACT TTATGCTGTG ACCAAAGGAC CACGGAGCAT TGAGACATTT 540
AAACAAATAG ACATGGACAA TGACAGGCAG CTCTCTAAAG CCGAGATAAA CCTCTACTTG 600



CAAAGGGAAT TTGAAAAAGA TGAGAAGCCA CGTGACAAGT CATATCAGGA TGCAGTTTTA 660 

GAAGA7ATTT TTAAGAAGAA TGACCATGAT GGTGATOGCT TCATTTCTCC CAAGGAATAC 720

AA7GTATACC AACACGATGA ACTATAGCAT ATTTGTATTT CTACTTTTTT TTTTTAGCTA 780

TTTACTGTAC TTTATGTATA AAACAAAGTC ACTTTTCTX AAGTTGTATT TGCTATTTTT 840 

10 CCCCTATGAG AAGATATTTT GATCTCCCCA ATACATTGAT TTTGGTATAA TAAATGTGAG 900 

GCT3TTTTGC AAACTTAAAA AAAAAVJWAAA AAAACT SG AG GGGGGCCCGT ACCCAANTCG 960 

CCGNATATGA T 971



(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1792 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

GAACCCCCTT TCTCCTGGTA AAGGGTAAGG GGGGGGATAA TGTTTACCAC AGGTACGAAA 60
TAGTCACTTT AACATTGAGA CCTCTGCCTC ATTGAATTCA GGTTTTTTAA GTACTTGAAA 120
CTGTTCAGAT TCTCCTTATT TTAGTTTCTT TTTACATTTA TGAAGTAGAA AGCATTGTTT 180
TGTAAACTGT TTTGAAAATA AATAGCCTAG TCTCTTATCC TCTTTAGCGT GGATTAAAGG 240
TGAAGTTCTG CAAATGGGAG AGTGTTCACA GTAGATAGCT CAGATTGATT GAACACATTT 300
GAGGAAGAGA CTCCTGCATG AGATACCAGC ATTTTTACAA ATACTTTTTA TGTACATTCT 360
TTATTTTGTC ATTTTGTCAA CCCTCTCCCC AAGCACATCT TCTTTCCTTT TACTATGTCT 420
ATGTAGGGAA AAACAAAACA AAAAATTGCA CTTACGTTAC ACTCCCAAAA TGTGGGTAAT 480
CCGTGTCTTT CAAAAAACAT TTCTGTTTTT TGTTTTGTTT TGGTCAGTCC ATTGCATAAG 540
TGACAAGTTT GGGTGCTTGT GGCACGTATG TATGAAGCGG GAGGGGGATG ASAATTGCCT 600
GTCCTTCAGT ARGCTGTAAA AGTAATTTAC ATGTAAGTAA AAAGGGAAAA TAGAATAGAT 660
GCCAAAGTCA TTTATTCAGT CCTTAGTTTT CTTATGTGGC ATTACTGCAT CTGCTAGTTA 720
GTGAGAAAGC ACCCTCAGCT TTTACTGCTC CCCTCCCTGC CTGCCAACAC ACTTGATGTG 780
TGCAAACAGC CCTCAAGTAT CTGTCAGATG ACCTATATAA GGTATTGAAT AAGGTATTCT 840
TGTCAGTTTA GAAATGGACT GGATAAAACT TACTTGGTTG TCATTATTTT ATCTCATTTG 900
TCCTGTTACA TGCCCTATGT TAAGATAATT ATATTGCCAC TAATAATCAA GATGCTAAAT 960



GAGTATTACA ACTGGC7AAT ATCATTTTTT ATATACAAGG GTATGTGTA7 A7TTGGAA77 1020
GRTATGAGAA ACTCATTTGT ACCCATTTGA GTGATATTSC ACAACAAACA CAGATAYC7A 1080
CAGACTCCGT TTTCATTTTC TCGTGTTCTT TATGATAA7G ATCTTTGTAG A7TGGTTATT 1140
TCTGTACTTT ATCTGTAATA AA'CTTTGTAG ATCCTGTGAA CCATTACTTT GC Z T AAA7CA 1200
CTTGAGACTT GAGTCTTTAA TAACAAAGCA TCAATATTCA CTAAAGTCAA TCTCTTTTGA 1260
GTTTCTGTGA CTTGGCTAGA AGCTCTTGAC ACTAAGGGAT TAGTGTTAAT TTTCCCTGGG 1320
GGTGTTCCAC TAG3GCATTA CTG7ATAA7G ACTTGATGTT GCCACATAGA CTTCAAGATA 1380
TATAATATTT TGA3GATTTT GTTGATTGGC CTATGTTTTA TTGCATAGTG TGAAACGTGT 1440
AAAGCTTGGT TAACCTGTAT ATAGATAGCT TATTGTTGAC TAGTTATAGT GTATTTAGGG 1500
TTGCCTGTAA TATTTAAGCT TCTTTACTGA TGTGTGTGCT GGTAGGAACA TATAATTTTT 1560
GTACATTATA TTTACTGAGA TGTTGCCTTT TTTATTTTAC AAATACTTTG GAATTCCAAT 1620
GTGTTTTTTG CTTCCGTGAG GATTAATTTG GAAAGGTTTT TAATGACATT CCACTGATTT 1680
CAGATTTTGC TTGAGATTGA CTTCAATAAA TTGTCCTGTA TGTTCCAAAA AAAAATTAAA 1740
AAACTCGAGG GGGGCCCGGT ACCCAANNCG CCGGATATGA TCGTAAACAA TC 1792






(2) INFORMATION FOR SEQ ID NO: 35: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 896 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

AGTTGNANAC AACAGGACCT GAGTCCTTGG GCAGCACCAG TAGGTTGCCC CYTGCYTCYT 60
GCCAGCYTCA CYTGCCACYT TYTGCCCCTY TCGGGATGCC TTCGCAGACA GAGYTYTTCG 120
CTGCCTGTGG TGGCCAYTCT TTGCTTTTGG TTYTCTTGCC CCITGGCCTC CCTTT7TGTC 180
CCCGGGCAGC CTTGTGTGAC CTGCCCTTTT CCCTCCCTTC CTTTCCAGGA CAAGCACGCC 240
GAGGAGGTGC GGAAAAACAA GGAGCTGAAG GAAGAGGCCT CCAGGTAAAG CCTAGAGGCC 300
AAAGAACTTT CCAGGTCAGC CGGACAGCTC CAGCAGCTCC ACGTTCCAGG CAGCCTCGMC 360
CGCCGGCTGC GCTCCCAGCA CT303GTTTG GGGGGAGGGG GG7GGCCAAG GGGCGTTTCC 420
TCTGCTTTTG GTGTTTGTAC ATGTTAAGAA TTGACCAGTG AAGCCATCCT ATTTGTTTCC 480
GGGGAACAAT GACGGGGTGG GARAGGGGAG AGGAGAGAGT TTGGGAAAGG GAGATGGAGA 540
AC^CTCAAG GACATTGCAA CCCTGCCCGG CGCAGATCTG ATTTTCACAT CTCTACCTGG 600
ACATTGAGCC TCCCAGGCAC CATGTTGAGG AGAGATGAAA ACCAGGGCGG TAGAACTTCA 660
GGGTGAAGGA CAGAGTCCTG GGTGGGGCAG CGGCTGCAGG GCGCACCA3A GAACCCA3CC 720
AGAGGGGGTG 7GAGTACCAG TGGTGTTGCT TCCACCCTGC AGCAGGTGGG ATGAGGTCTG 780
TGTGTGTGTG TGAACCATCA TTTTTTGATC ATCATGAGCA ATGAAACATT GAAAAAAAAA 840
AAAAAAACTG GAGGGGGGCC CGTACCCAAN TCGCCGMATA GTSATCGTAA ACAATC 896





(2) INFORMATION FOR SEQ ID NO: 36: 






(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 912 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

TCGACCCACG CGTCCGGTCA GCCAGTCGCA TCCAGCCATG ACAGCCTTCT GCTCCCTGCT 60
CCTGCAAGCG CAGAGCCTCC TACCCAGGAC CATGGCAGCC CCCCAGGACA GCCTCAGACC 120
AGGGGAGGAA GACGAAGGGA TGCAGCTGCT ACAGACAAAG GACTCCATGG CCAAGGGAGC 180
TAGGCCCGGG GCCAKCCGCG GCAGGGCTCG CTGC-GGTCTG GCCTACACGC TGCTGCACAA 240
CCCAACCCTG CAGGTCTTCC GCAAGACGGC CCTGTTGGGT GCCAATGGTG CCCAGCCCTG 300
ARGGCAGGGA AKGTCAACCC ACCTGCCCAT CTGTGCTGAG GCATGTTCCT GCCTACCATC 360
CTCCTCCCTC CCCGGCTCTC CTCCCAGCAT CACACCAGCC ATGCAGCCAG CAGGTCCTCC 420
GGATCACYGT GGTTKGGTGG AGGTCTGTCT GCACTGGGAG CCTCARGARG c<:tctgctcc 480
ACCCACTTGG CTATGGGAGA GCCAGCAGGG GTTCTGGAGA AAAAAACTGG TGGGTTAGGG 540
CCTTGGTCCA GGAGCCAGTT GAGCCAGGGC AGCCACATCC AGGCGTCTCC CTACCCTGGC 600
TCTGCCATCA GCCTTGAAGG GCCTCGATGA AGCCTTCTCT GGAACCACTC CAGCCCAGCT 660
CCACCTCAGC CTTGGCCTTC ACGX7TGTGGA AGCAGCCAAG GCACTTCCTC ACCCCiTCAG 720
CGCCACGGAC CTYTYTGGGG AGTGGCCGGA AAGCTCCCSG GCCTYTGGCC TGCAGGGCAG 780
CCCAAGTCAT GACTCAGACC AGGTCCCACA CTGAGCTGCC CACACTCGAG AGCCAGATAT 840
TTTTGTAGTT TTTATKCCTT TGGCTATTAT GAAAGAGGTT AGTGTGTTCC CTGCAATAAA 900
912



(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1382 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

AATTCGGCAC GAGCGGAGGC GAGGGAAACT RAGGGCGAAA GTTGTGTGTC GTGTTGGCAG 60
GAGGGCCTAG AAGGGAAAGA CTGTCTAGTG GGACAATGTC ATATTATAAA TTT<3GAATGC 120
TGAATAGAAA A7TATAGATT TTGATATTGA AGGAAATGAA GCGAA3CYTA AATSAAAATT 180
CAGCTCGAAG TACAGCAGGC TGTTTGCCTG TTCCGTTGTT CAATCAGAAA AAGAGGAACA 240
GACAGCCATT AACTTCTAAT CCACTTAAAG ATGATTCAGG TATCAGTACC CCTTCTGACA 300
ATTATGATTT TCCTCCTCTA CCTACAGATT GGGCCTGGGA AGCTGTGAAT CCAGAGTTKG 360
CTCCTGTAAT GAAAACAGTG GACACCGGGC AAATACCACA TTCAGTTTCT CGTCCTCTGA 420
GAAGTCAAGA TTCTGTCTTT AACTCTATTC AATCAAATAC TGGAAGAAGC GAGGGTGGTT 480
GGAGCTACAG AGATGGTAAC AAAAATACCA GCTTGAAAAC TTGGRATAAA AATGATTTTA 540
AGCCTCAATG TAAACGAACA AACTTAGTGG CAAATGATGG AAAAAATTCT TGTCCAATGA 600
GTTCGGGAGC TCAACAACAA AAACAATTAA GAACACCTGA ACCTCCTAAC TTATCTCGCA 660
ACAAAGAAAC CGAGCTACTC AGACAAACAC ATTCATCAAA AATATCTGGC TGCACAATGA 720
GAGGGCTAGA CAAAAACAGT GCACTACAGA CACTTAAGCC CAATTTTCAA CAAAATCAAT 780
ATAAGANACA AATGTTGGAT GATATTCCAG AAGACAACAC CCTGAAGGAA ACCTCATTGT 840
ATCAGTTACA GTTTAAGGAA AAAGCTAGTT CTTTAAGAAT TATTTCTGCA GTTATTGAAA 900
GCATGAAGTA TTGGCGTGAA CATGCACAGA AAACTGTACT TCTTTTTGAA GTATTAGCTG 960
TTCTTGATTC AGCTGTTACA CCTGGCCCAT ATTATTCGAA GACTTTTCTT ATGAGGGATG 1020
GGAAAAATAC TCTGCCTTGT GTCTTTTATG AAATCGATCG TGAACTTCCG AGACTGATTA 1080
GAGGCCGAGT TCATAGATGT GTTGGCAACT ATGACCAGAA AAAGAACATT TTCCAATGTG 1140
TITCTGTCAG ACCGGCGTCT GTTTCTGAGC AAAAAACTTT CCAGGCATTT GTCAAAATTG 1200
CAGATGTTGA GATGCAGTAT TATATTAATG TGATGAATGA AACTTAAGTA GTGATAAAAG 1260
GAAGTTTAGC ATAAATTATA GCAGTTTTCT GTTATTGCTT AATTTACCAT CTCCATAGTT 1320
TTATAGCTAC TATTGTATTT CACTTGTTGA ATTAAAGTAT TTGAATTCTT TTAAAAAAAA 1380
AA 1382






(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 872 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

GGGCTACTTC AAAGCCCTGG GCCTTATTTC TTCAGGTAAA AAAATATAAA GTCAGATCTC 60
ATCCCGGCTG GCCATGCTGT TAGACCCTTT CATCCTTCTC TTCTGCCTCT TCTCAACAGC 120
TGCCCAGTCC TGTTTGGAAT TCATATACAT ACAGTTCTAA TACTGATGTA TTTACCCTCA 180
TAAGCCACTC AACCCAGAAT CTTATTTGAA TTATAATCCA GAAACATCAG GTGACGTGTG 240
AGACTACTGT ATGAGAAAGA GACAGTTTAA GGGTCAGTCC AATGGAAAAA AGAGTTCTCA 300
GAGCTTTCTT TAGCTTATTC TCATCAAAGA GCTTTCTCTG CAGAAGGAAC CTACTGGTTC 360
CTCCTTTCCA GTCCTAGAAA TCCTGACCTA GAGTGGCTTA ATCCTGCTAG C^CCTCTCTC 420
TCGCACTCTG GTGCCAAATG ACTCCAGGAA CTGGGCCATG ATGTGGTGGG AATGACCTTA 480
CCCTGAGCAT GTCACTCATG CATTGAACAA CAGCTAAGAG CAGAGCTTAG AGCTTAGAGC 540
TGGGCCCTGT AAGGTGAGAG GAATCACATC CTGCAGAAGT CTGTCCTGAG AAGCAGGTAC 600
TCCTGTCACA GCAGAGACAC AGTGGATACC TGAGTAACAA TAATACAAGA CAGGACGTGG 660
GMACAGCAAA AGATTTGGGT GTCAGAAGAR GCCGAGAACA CTTYCAGGCA GGAACATTCA 720
RARTTGTTCT TGGAGGAART AGGCMCSAAG GCTGGGCAGG ATTTCMCGGG GCAGAGATGG 780
AGCAAGCAAT TGAAATGAAA GCCATGGCAT GGGAAAAGGA GCACTGGCCA CAGGGAGTGC 840
AACGTTGTGA TGCAAGGCCA CTGTGGAGCC AT 872




(2) INFORMATION FOR SEQ ID NO: 39: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 812 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

GGCAGAGGCT CACCCCAGCA GAGATTGAGG GGGAACCGTG ATGAAATTTT TAAGTATTCT 60
GCTTGATGAT AATAATTTTY CTCTTATGTT AATGTTGGCT CCGTTTGGGT GTTTAGCTTT 120
TGAAAGGAGT ATGAAAATGC GGAATGGGGC TTTGGGGCTT GAGGAGGTGT GATCTCTAGT 180
GTTTAAAAAA TTTAATTGCA CAAATAGAAA TAATTCACCC ACATTATTGA ACCCCACTAA 240
AGCATATCCT TTTTGTCCAT ATTCCTTTCC TGCTGCCCTC GTGTGTACCA TTATTACTCA 300
GTTGTGATTT GAGCTCGTTC CACTTAAAGT CATTCATAGA TACTTTTGCG TCGTGTTKGA 360
ATATTTATTG AATTTCTATT CTGTGTTTTA CTTAATTACT TTATTATGGA ACCTTTACAC 420
AGGTCTGGTG TACTTGTTCT TTGAAAAGTC TTATGTTGAG CACCATCACT GAGCATATAG 480
CTTTTTCCTT ATTTCCTTGG GATAATTACC CGAAGTGGAA ATACCGAATC AAACTTCTGT 540
TTTCTTTCTT TGGCACTATT ATATAAATTG TTTTCCAAAC AAGGCATGTT TAGAATAGAC 600
ATTTTTCAAA ATCTGGGTAT TTGTCCTATT TTGCTCTCTG TATGCAGAAT TCAGCGGGGT 660
GCCAAGTCGT TTTCTGTGTG GGTTGAGAGA CAGGCTGTGC AGCCCACTGT TGCATAGGAC 720
TAACTACTAC AAATCATGCT GAGACCGAGC TATTTTTGCT GCTTAGARGC TTTGCAGCCT 780
TGAGTAAGTT TCGNCATCTG GAAACNTTGN AA 812



(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1515 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

AATTCGGCAC GAGGGAAATT CAAGCACTTT TCCTAAAAGA AGGGGGAATG GATGCTGAAA 60
CAACACGTNT CCCACAAAGG GAGCAGACAC TGGGCTTGTG AAGCTGCCCC ATACCTTCCC 120
CACAGAACTG GGGTCCGGCC TCCCTGACAT GCAGATTTCC ACCCAGAAGA CAGAGAAGGA 180
GCCAGTGGTC ATGGAATGGG CTGGGGTCAA AGACTGGGTG CCTGGGAGCT GAGGCAGCCA 240
CCGTTTCAGC CTGGCCAGCC CTCTGGACCC CGAGGTTGGA CCCTACTGTG ACACACCTAC 300
CATGCGGACA CTCTTCAACC TCCTCTGGCT TGCCCTGGCC TGCAGCCCTG TTCACACTAC 360
CCTGTCAAAG TCAGATGCCA AAAAAGCCGC CTCAAAGACG CTGCTGGAGA AGAGTCAGTT 420
TTCAGATAAG CCGGTGCAAG ACCGGGGTTT GGTGGTGACG GACCTCAAAG CTGAGAGTGT 480
GGTTCTTGAG CATCGCAGCT ACTGCTCGGC AAAGGCCCGG GACAGACACT TTGCTGGGGA 540
TGTACTGGGC TATGTCACTC CATGGAACAG CCATGGCTAC GATGTCACCA AG^ivrllGG 600
GAGCAAGTTC ACACAGATCT CACCCGTCTG GCTGCAGCTG AAGAGACGTG GCCGTGAGAT 660
GTTTGAGGTC ACGGGCCTCC ACGACGTGGA CCAAGGGTGG ATGCGAGCTG TCAGGAAGCA 720
TGCCAAGGGC CTGCACATAG TGCCTCGGCT CCTGTTTGAG GACTGGACTT ACGATGATTT 780






3CGGAACGTC TTAGACAGTG AGGATGAGAT A3AGGAGCTG AGZAA3ACCG TGG T C J AG* 3 T 840
GGCAAAGAAC CAGCATTTCG ATGGCTTC3T GGTGGAGGTC TGGAACCAGC TCCTAAGOIA 900
GAAGCGCGTG ACCGACCAGC TGGGCATGTT CACGCACAAG GA ^TTTGAGC AGCTGGC::C 960
CGTGCTGGAT GGTTTCAGCC TCATGACCTA CGACTACTCT ACAGCGCATC AGCCTGGC 3C 1020
TAATGCACC: CTGTCCTGGG TTCGAGCCTG CGTCCAGGTC CTGGACCCGA AGTCCAAGTG 1080

GCGAAGCAAA ATCCTCCTGG GGCTCAACTT CTATGGTATG GACTACGCGA CCTCCAAG3A 1140 

TGCCC3TGAG CCTGTTGTC 3 GGGCCAGGTA CATCCAGACA CT3AA3GACC ACA3GCCC:G 1200 

GATGGTGTGG GACAGCCAGG YCTCAGAGCA CTTCTTCGAG TACAAGAAGA GC03CAGTGG 1260 

GAGGCACGTC GTCTTCTACC CAACCCTGAA GTCCCTGCA3 GTGCGGCTGG AGCTGGCCCG 1320 

GGAGCTGGGC GTTGGGGTCT CTATCTGGGA GCTGGGCCAG GGCCTGGACT ACTTCTACGA 1380 

CCTGCTCTAG GTGGGCATTG CGGCCTCCGC GGTGGACGTG TTCTTTTCTA AGCCATGGAG 1440 

TGAGTGAGCA GGTGTGAAAT ACAGGCCTTC ACTCCGTTAA AAAAAAAAAA AAAAAAAAAA 1500 

AAAAAAAAAA AAAAA 1515 



(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 704 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

AAGATGGTGG CGCCCAGAGC TTCGCTCTAT GCTGCTCCCC TGAGAGAGGC GTTTCCATCA 60 

ACCAGTTTTG CAAGGAGTTC AATGAGAGGA CAAAGGACAT CAAGGAAGGC ATTCCTCTGC 120 

CTACCAAGAT TTTAGTGAAG CCTGACAGGA CATTTGAAAT TAAGATTGGA CAGCCCACTG 180 

TTTCCTACTT CCTGAAGGCA GCAGCTGGGA TTGAAAAGGG GGCCCGGCAA ACAGGGAAAG 240 

AGGTGGCAGG CCTGGTGACC TTGAAGCATG TGTATGAGAT TGCCCGCATC AAAGCTCAGG 300 

ATGAGGCATT TGCCCTGCAG GATGTACCCC TGTCGTCTGT TGTCCGCTCC ATCATCGGGT 360 

CTGCCCGTTC TCTGGGCATT CGCGTGGTGA AGGACCTCAG TTCAGAAGAG CTTGCAGCTT 420 

TCCAGAAGGA ACGAGCCATC TTCCTGGCTG CTCAGAAGGA GGCAGATTTG GCTGCCCAAG 480 

AAGAAGCTGC CAAGAAGTGA CCCTTGCGCC ACCAACTCCC AGATTTCAAA GGAGGTAGTT 540
GCAAAAGCTG TGCCCAAGGG GAGGAAGGAG GTCACACCAA TATGATGATG GTTTTCATGA 600
CTTTGAATGA TATATTTTTG TACATCTAGC TGTATCGAGG CATCAGGCCT GAATAAACAT 660
CCTTTCTTAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAA 704

(2) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1094 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

GGCAGCTTTC TTACAAACCC ATCCTTCTGA AATGTTGCTT CAAATTCATC CTCTGCTCCC 60 

CAGTCCCACT ATTCCACACA TACTGTTACT GTTTCTTTAT CCTACTTTCT CAATTTTGGA 120 

ACATAGTTGC AGTTACTGCA TTGAATACCT GTGOGTTTGC CTGTTGTTCT GTCTGTCTCT 180
GTGGTTCTTG TAATANTGGA TCCCAGAGAT AAAATGGACA GTTGTNATGC ACAGTTAATT 240
CAGAAACTAG ACCTTACTTG CTGTGTGAAA TAGCAACTAA ATTCTCAGTG AACTCAGCTG 300
ANCTTTATCT CCTTTTGTTT CCCCAATTTA TAATTTCAGT TCAGGCCCAG AAAGATGGAA 360
TCCCAGCTAA GAAATACAAG TTACACCCTG TACTAGCAGC CCATGTGTGC ATGTTCTTTA 420
AGTGCTCTTG CAGCTATGTC ATTTATATTG ATTTCCCTGT ATTATTATAA GCAAAGCAAA 480
TTTGAGGAAA AAAACCCATA ATACCACACC TCATTTTTTT CAAGTAATAG GGTCATAAGT 540
CTCATYCTYC ATATAATATG TTGAGTATGC AGTATATTAT GTGTTAGGCT CTGGANAGGC 600
AGAGGTTAGA TCATGTWACA GATCATATCK GATTAGGCAG ATAAACAGTA TTTTAACCTT 660
TTCCTTATTA TATGTAACTT GCTTTCAGGT TTTTTAATGT TACTATTATG TCTTTAATAT 720
ATTATCTTTA TTTGTACTTT TGTATACAGA GTGATTTTCC TTTTTTAAAA AAAATTGTGT 780
CTTTAGGATG GATTCCAAAG ATGTGGAATC AGTAGGTTTA AGGAATATGG ATATTTTGGC 840
TGGCAAGGTG GCTCACACCT GTAATCCCAG CACTTTGGGA GGCTGAGGTG GGTGGATCAC 900
CTGAAGTCAG GAGTTCGAGA CCAGCCTGAC CAACATGGCG AAACCCTGTT TNTACTAAAG 960
ACACACWWAA AATTRGCCAG TGGTGGTGGC ATGTGCTTGT AGTCCCACTT AGCTACTCGA 1020
GAGGCTGAGG CAGGAGAATC GCTTGAACCC GGGAGGCAGA GGTTGCAGTG AGGCAAGATG 1080
GCACCTCTAC ACTC 1094

(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1821 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

TGGCT T 7.--C-^r CATCArtCTT rCCTFC-C-CTG GAACTACTGG ACAGACCCTT TTGAGATGTG 60 

CCTGTGGTOC TGTCC-AGATG TGTG7AGTGG TCTTAGCTCT TTGTTGAGCT TGTGTGTGTG 120
TTG7GTAi77C TTA3-CTGTAT 3CTGA^.TTG GGCGTGTGTT GGAGGGCTTC TTAGCTCTTT 180
GG7GAG-~ r ~G TATTTCTATG 7GTTTG7ATC ASCTGAATGT TGCTGGAAAT AAAACCTTGG 240
TTTGT^AAC-G CTCiTTTTTG TGGGAAGTAA GTAGGGGAAA AGGTCTTTGA GGGTTCCTAG 300
GCTCCTrrOT ACA-.CAGGAA AATGCCTCAA AGCCTTGCTT CCCAGCAACC 7GGGGCTGGT 360
TCCCAG7GCC 7GG7CCTGCC CCTTCCTGGT TCTTATCTCA AGGCAGAGCT TCTGAATTTC 420
AGGCC1TCAT 7CCAGAGCCC 7GT7GTGGCG AGGCCTTCCT TTGCTGGAGG AAGGTACACA 480
GGG'GAAC-CT GA7GCTGT A Z TTGGGGGATC TCCTTGGCCT GTTCCACCAA GTGAGAGAAG 540
GTACTTACTC rTG7ACCTCC TGTTCAGCCA GGTGCATTAA CAGACCTCCC TACAGCTGTA 600
GGAACTACTG TGIG.-.G--J3CT GAGGCAAGGG GATTTCTCAG GTCATTTGGA GAACAAGTGC 660
TTTA3TAG7A GTT7A A AJGT A GTAACTGCTA CTGTATTTAG TGGGGTGGAA TTCAGAAGAA 720
ATT7GAAGAC GAJGATCATOG GTGGTCTGCA TGTGAATGAA CAGGAATGAG CCGGACAGCC 780
TGGCTG^C.-^ -^C^TTCTTC CTCCCCATTT GGACCCTTCT CTGCCCTTAC ATTTTTGTTT 840
CTCCATC7AC CA.CGATCCAC CAGTCTATTT ATTAACTTAG CAAGAGGACA AGTAAAGGGC 900
CCTCTTGC-GT 7GA77T7C-CT TCTTTCTTTC TGTGGAGGAT ATACTAAGTG CGACTTTGCC 960
CTA7CCTA7T TGGAAATCCC TAACAGAATT GAGTTTTCTA TTAAGGATCC AAAAAGAAAA 1020
ACAAAA7GCT AATGAAGCCA TCAGTCAAGG GTCACATGCC AATAAACAAT AAATTTTCCA 1080
GAAGAAA7GA AA7CCAACTA GACAAATAAA GTAGAGCTTA TGAAATGGTT CAGTAAGGAT 1140
GA g n rcrrG TTTTTTGTTT TGTTTTGTTT TGKTTTTTTA AAGACGGAGT CTCGCTCTGT 1200
CACTCAGGCT GGAGTGCAGT GGTATGATCT TGGCTCACTG TAACCTCCGC CTCCCGGGT: 1260
CAAGCCA7TC TCC7GCCTCA GTCTCCTGAG TAGCTGGGAT TACAGGTGCG TGCCACCATG 1320
CCTGGCTAAT TTTTGTGTr? TTAGTAGAGA CAGGGTTTCA CCA7GTTGGT CGGGCTGGTC 1380
TCAAACTCCT GACCTCTTGA TCCGCCTGCC TTGGCCTCCC AAAGTGATGG GATTACAGAT 1440
GTGAGCCACC CGTGCCCTAG CCAAGGATGA GA7TTTTAAA GTATGTTTCA GTTCTGTGTC 1500
ATGGTTGGAA GACAGAGTAG GAAGGATATG GAAAAGGTCA TGGGGAAGCA GAGGTGATTC 1560
ATGGCT770T GAA7TTGAGG TXGAATGGTTC CTTATTGTCT AGGCCACT7G TGAAGAATAT 1620



GAGTCAGTTA TTGCCAGCCT T3GAAT7TAC TTCTCTAGCT TACAATGGAC CTTTTGAACT 1680
GGAAAACACC TTGTCTGCAT TCACTTTAAA ATGTCAAAAC TAATTTTTAT AATAAATGTT 1740
TATTTTCACA TTGAAAAAAA AAAAAAATTT AAAAAC YC 3G GGGGGGCCCS GWACCCCATT 1800
NGCCCCTAAG GGGGGGGGTT T 1821



(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1024 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:





GGGGCACAGT TGAAGAAGCG ACCGAGGGAC TGGGAGTCGT TAGTGAGGAT GACGCGGCAT 60
GGCAAGAACT GCACCGCAGG GCCGTCTACA CCTACCACGA GAAGAAGAAG GACACAGCGG 120
CCTCGGGCTA TGGGACCCAG AACATTCGAC TGAGCCGGGA TGCCGTGAAG GACTTCGACT 180
GCTGTTGTCT CTCCCTGCAG CCTTGCCACG ATCCTGTTGT CACCCCAGAT GGCTACCTGT 240
ATGAGCGTGA GGCCATCCTG GAGTACATTC TGCACCAGAA GAAGGAGATT GCCCGGCAGA 300
TGAAGGCCTA CGAGAAGCAG CGGGGCACCC GGCGCGAGGA GCAGAAGGAG CTTCAGCGGG 360
CGGCCTCGCA GGACCATGTG CGGGGCTTCC TGGAGAAGGA GTCGGCTATC GTGAGCCGGC 420
CCCTCAACCC TTTCACAGCC AAGGCCCTCT CGGGCACCAG CCCAGATGAT GTCCAACCTG 480
GGCCCAGTGT GGGTCCTCCA AGTAAGGACA AGGACAAAGT GCTGCCCAGC TTCTGGATCC 540
CGTCGCTGAC GCCCGAAGCC AAGGCCACCA AGCTGGAGAA GCCGTCCCGC ACGGTGACCT 600
GCCCCATGTC AGGGAAGCCC CTGCGCATGT CGGACCTGAC GCCCGTGCAC TTCACACCGC 660
TAGACAGCTC CGTGGACCGC GTGGGGCTCA TCACCCGCAG CGAGCGCTAC GTGTGTGCCG 720
TGACCCGCGA CAGCCTGAGC AACGCCACCC CCTGCGCTGT GCTGCGGCCC TCTGGGGCTG 780
TGGTCACCCT CGAATGCGTG GAGAAGCTGA TTCGGAAGGA CATGGTGGAC CCTGTGACTG 840
GAGACAAACT CACAGACCGC GACATCATCG TGCTGCAGCG GGGCGGTACC GSTTCGCGGG 900
CTCCGGAGTG AAGCTGCAAG CGGAGAAATC ACGGCCGGTG ATGCAGGCCT GAGTGTGTGC 960
GGGAGACCAA ATAAACCGGC TTGGGTGCGC AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1020
AAAA 1024



(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 983 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

C oACACGGCT GCGAGAAGAC GACAGAAGGG CCCGACCGCG AGCCGTCCAG GTCTCAGTGC 60 

T3TGCCCCCC CCAGAGCCTA GAGGATGTTT CATGGGATCC CAGCCACGCC GGGCATAGGA 120 

GCCCCTGGGA ACAAGCCGGA GCTGTATGAG GAAGTGAAGT TGTACAAGAA 0GCCCGG3AG 180
AGGGAGAAGT ACGACAACAT GGCACAGCTG TTTGCGGTGG TGAAGACAAT GCAAGCCCTG 240
GAGAAGGCCT ACATCAAGGA CTGTGTCTCC CCCAGCGAGT ACACTGCAGC CTGCTCCCGG 300
CTCCTGGTCC AATACAAAGC TGCCTTCAGG CAGGTCCAGG GCTCAGAAAT CAGCTCTATT 360
GACGAATTCT GCCGCAAGTT CCGCCTGGAC TGCCCGCTGG CCATGGAGCG GATCAAGGAG 420
GACCGGCCCA TCACCATCAA GGACGACAAG GGCAACCTCA ACCGCTGCAT CGCAGACGTG 480
GTCTCGCTCT TCATCACGGT CATGGACAAG CTGCGCCTGG AGATCCGCGC CATGGATGAG 540
ATCCAGCCCG ACCTGCGAGA GCTGATGGAG ACCATGCACC GCATGAGCCA CCTCCCACCC 600
GACTTTGAG3 GCCGCCAGAC GGTCAGCCAG TGGCTGCAGA CCCTGAGCGG CATGTCGGCG 660
TCAGATGAGC TGGACGACTC ACAGGTGCGT CAGATGCTGT TCGACCTGGA GTCAGCCTAC 720
AACGCCTTCA ACCGCTTCCT GCATGCCTGA GCCCGGGGCA CTAGCCCTTG CACAGAAGGG 780
CAGAGTCTGA GGCGATGGCT CCTGGTCCCC TGTCCGCCAC ACAGGCCGTG GTCATCCACA 840
CAACTCACTG TCTGCAGCTG CCTGTCTGGT GTCTGTCTTT GGTGTCAGAA CTTTTGGGCC 900
GGGCCCCTCC CCACAATAAA GATGCTCTCC GACCTTCAAA AAAAAAAAAA AAAAAAAAGR 960
KGSGGCCGGT CCCCANTCCC CCC 983




(2) INFORMATION FOR SEQ ID NO: 46: 



(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 2421 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

CCGGCTGATC GCTGCCGCTC CGCCAATACA ATAGAGCCAK CCACTACCAG CAGCCTGGCC 60






CTCTTCCTGC TTCTCCAGAG AGACCAATC'J AGCCGAA-.C G3C-GTTTGCC TGAGGAGAAG 120
GAGGAAGTGA CCATGGACAC AAGTGAAAAC AGACCTGAAA ATGATGTTCC AGAACCTCCC 180
ATGCCTATTG CAGACCAAGT CAGCAA7GAT GACCGCCCGG A3GGCAGTGT TGAAGATGAG 240
GAGAAGAA^G AGAGCTCGCT GCCCAAATCA TTCAAGAGGA AGATCTCCGT TGTCTCAGCT 300
ACCAAGGGGG TGCCAGCTGG AAACAGTGAC ACAGAGGGGG GCCA3CCTGG TC3GAAACGA 360
CGCTGGGGAG CCAGCACAGC CACCACACAG AA3AAACCTT CCATCAGTAT CACCACTGAA 420
TCACTAAAGA GCCTCATCCC CGACATCAAA CCCCTGGCGG GGCAGGAGGC TGTTGTGGAT 480
CTTCATGCTG ATGACTCTCG CATCTCTGAG GATGAGACAG AGCGTAATGG CGATGATGGG 540
ACCCATGACA AGGGGCTGAA AATATGCCGG ACAGTCACTC AGGTAGTACC TGCAGA3GGC 600
CAGGAGAATG GGCAGAGGGA AGAAGAGGAA GAAGAGAAGG AACCTGAAGC AGAACCTCCT 660
GTACCTCCCC AGGTGTCAGT AGAGGTGGCC TTGCCCCCAC CTGCAGAGCA TGAAGTAAAG 720
AAAGTGACTT TAGGAGATAC CTTAACTCGA CGTTCCATTA GCCAGCAGAA GTCCGGAGTT 780
TCCATTACCA TTGATGACCC AGTCCGAACT GCCCAGGTGC CCTCCCCACC CCGGGGCAAG 840
ATTAGCAACA TTGTCCATAT CTCCAATTTG GTCCGTCCTT TCACTTTAGG CCAGCTAAAG 900
GAGTTGTTGG GGCGCACAGG AACCTTGGTG GAAGAGGCCT TCTGGATTGA CAAGATCAAA 960
TCTCATTGCT TTGTAACGTA CTCAACAGTA GAGGAAGCTG TTGCCACCCG CACAGCTCTG 1020
CACGGGGTCA AATGGCCCCA GTCCAATCCC AAATTCCTTT GTGCTGACTA TGCCGAGCAA 1080
GATGAGCTGG ATTATCACCG AGGCCTCTTG GTGGACCGTC CCTCTGAAAC TAAGACAGAG 1140
GAGCAGGGAA TACCACGGCC CCTGCACCCC CCACCCCCAC CCCCGGTCCA GCCACCACAG 1200
CACCCCCGGG CAGAGCAGCG GGAGCAGGAA CGGGCAGTGC GGGAACAGTG GGCAGAACGG 1260
GAACGGGAAA TGGAGCGGCG GGAGCGGACT CGATCAGAGC GTGAATGGGA TCGGGACAAA 1320
GTTCGAGAAG GGCCCCGTTC CCGATCAAGG TCCCGTRACC GCCGCCGCAA GGAACGTGCG 1380
AAGTCTAAAG AAAAGAAGAG TGAGAAGAAA GAGAAAGCCC AGGAGGAACC ACCTGCCAAG 1440
CTGCTGGATG ACCTTTTCCG AAAGACCAAG GCAGCTCCCT GCATCTATTG GCTCCCACTG 1500
ACTGACAGCC AGATCGTTCA GAAAGAGGCA GAGCGGGCCG AACGGGCCAA GGAGCGGGAG 1560
AAGCGGCGAA AGGAGCAAGA AGAAGAAGAG CAAAAGGAGC GGGAGAAGGA AGCCGAGCGG 1620
GAACGGAACC GACAGCTGGA GCGAGAGAAA CGTCGGGAGC ACAGTCGGGA GAGGGACAGG 1680
GAGAGAGAGA GAGAAAGGGA GCGGGACAGG GGGGACCGAG ATCGGGATAG GGAAAGGGAC 1740
CGAGAACGAG GCAGGGAAAG GGATCGCAGG GACACCAAGC GCCACAGCAG AAGCCGGAGT 1800
CGGAGCACAC CTGTGCGGGA CCGGGGTGGG CGCCGCTAGC TGGGAAAACA CTAGAGCTGC 1860
AGGTACCAGC CACTCGGCCC CAGGGGGTTA TGGCCACAGA COGATAGGCA CAGTCTCCA.2 1920
CACCCTGGAG CCAAGGGTCT TTCACATCAC CTATCCCTAC A7ACATACCA AATGGAAAAG 1980
TGGCCATCCT TTTCCCCCCA AACACACCCC CTTAACCTAT CTCTTGGGAC TTAGCCCGA: 2040
CCTCCCTCTC ATTTC2CATT AAGTCTGAGA GGCAAGAGCT AGGTTAG3CA AGGAGGTC-CT 2100
TGGCCAGAGA TGGGGAACAG CCAGGTGCCC CAGTCCTCTG ATTITTCCTC CATCCTGCTT 2160
ACCACCTCCC TGGGTACTTA CAGCCTTCTC TTGGGAACAG CCGGGGCCAG GACTGGGTCA 2220
CCTATGAGCT GAATCAGCAT CTCCTCCTGA GTCCCAGGGC CCCTGCAGTT CCCAGT'^TCT 2280
TCTGTCCTGC AGCCCTTGCC TCTTTCCCAC AGGTTCCACT TTATATCCAC CTTTTCZTTT 2340
TGTTCAATTT TTATTTTTAT TTTTTTTATT ATTAAATGAT GTGGTCTATG GAAAAAAAAA 2400
TAAAAATCTG ACTTAGTTTT A 2421




(2) INFORMATION FOR SEQ ID NO: 47: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 840 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 

CTCAAACTCC TGAGCTGAAG CGATCTACCT GCCTCAGCTA GGATTACAGG TGTGAGCCAC 60
CGCACCCAAC CTCAATAAGC KTATTTGATA AAAKATATGC AAGCTCCCTT TATKCACTTT 120
TCATTCAGAA TGTTTAGTAA TTTGTATTGT TTTTCAGATT TTCAGCCCAA TATATCTCCY 180
TGCCCACTGT GTCACTGTAT TCTACCTAWA CATCATCACG TGTTTCTGCT ATTGGCTGTA 240
TGATGGAACA CTGCGGCrCA TTTTCCTGAA AACTGCCGAT AGTGCATAGA RTGCTGGGA? 300
GGAAACCAGA ARCTTTGAAT TCAAGCCTTG GTTCTGCCTT GTTTTTGCTT GGGTGGCCTT 360
GAGTCAGCCA CATACCTTTT AAAATCTCAA TTTATTAGAA ATTATTCCAA ATCAAAATCA 420
AATGAGAAGG TATATACAAA AGTGCTTTAT CCCACAATAA ACTATTCAAG AGAGAGCAAA 480
GGAGAGGACA TTTACTCAAC ACCTCCTAAA AGGCAGCCAG TGAAATTAGG CATTTTATTT 540
AATCCTCCTG GCAACTCTGA GAGTAAAGCA TTATTAATCC CATTTTGGCT GTTTAAAGAA 600
ATTATTTGCA CTAGATTCCA GCTGTAGTTT AGYTTCAGAA AAAAAAATCC TGAGATGTGA 660
ATTCACAGCT TTCTGGGTTT AAAGCCCAAG CTCTATCACA TCATGCTATT ATTGTTACAT 720
TACTGCTAGT TCTATGAAAA GAAATACTAA TTTATGAAAT ACATCTTATC CAAAAAAAAA 780
AAAAAAAAAC TGGGAGGGGG GGCCCGTACC CAAATCGCCG GATAGTGATC GTAAACAA7C 840



(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2432 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

OjCACGAGGC CCGGAACGCT GAGGAAGGGC CCGTCCCGCC TTCCCCGGCG CGCCATGGAG 60
CCCCG3GCGG TTGCAGAAGC CGTGGAGACG GGTGAGGAGG ATGTGATTAT GGAAGCTCTG 120
CGGTCATACA ACCAGGAGCA CTCCCAGAGC TTCACGTTTG ATGATGCCCA ACAGGAGGAC 180
CGGAAGAGAC TGGCGGASTG CTGGTCTCCG TCCTG3AACA GG3CTTGCCA CCCTCCCACC 240
GTGTCATCTG GCTGCAGAGT GTCCGAATCC TGTCCCGGGA CCGCAACTGC CTGGACCCGT 300
TCACCAGCCG CCAGAGCCTG CAGGCAYTAG CCTGYTATGY TGACATCTCT GTCTCTGAGG 360
GGTCCGTCCC AGAGTCCGCA GACATGGATG TTGTACTGGA GTCCCTCAAG TGCCTGTGCA 420
ACCTCGTGCT CAGCAGCCCT GTGGCACAGA TGCTGGCAGC AGAGGCCCGC CTAGTGGTGA 480
AGCTCACAGA GCGTGTGGGG CTGTACCGTG AGAGGAGCTT CCCCCACGAT GTCCAGTTCT 540
TTGACTTGCG GCTCCTCTTC CTGCTAACGG CACTCCGCAC CGATGTGCGC CANAGCTGTT 600
TCAGGAGCTG AAAGGAGTGC GCCTGCTAAC TGACACACTG GAGCTGACGC TGGGGGTGAC 660
TCCTGAAGGG AACCCCCCAC CCACGCTCCT TCCTTCCCAA GAGACTGAGC GGGCCATGGA 720
GATCCTCAAA GTGCTCTTCA ACATCACCCT GGACTCCATC AAGGGGGAGG TGGACGAGGA 780
AGACGCTGCC CTTTACCGAC ACCTGGGGAC CCTTCTCCGG CACTGTGTGA TGATCGCTAC 840
TGCTGGAGAC CGCACAGAGG AGTTCCACGG CCACGCAGTA ASCCTCCTGG GGAACTTGCC 900
CCTCAAGTGT CTGGATGTTC TCCTCACCCT GGAGCCACAT GGAGACTCCA CGGAGTTCAT 960
GGGAGTGAAT ATGGATGTGA TTCGTGCCCT CCTCATCTTC CTAGAGAAGC GTTTGCACAA 1020
GACACACAGG CTGAAGGAGA GTGTAGCTCC CGTGCTGAGC GTGCTGACTG AATGTGCCCG 1080
GATGCACCGC CCAGCCAGGA AGTTCCTGAA GGCCCAGGTG CTGCCCCCTC TGCGGGATGT 1140
GAGGACACGG CCTGAGGTTG GGGAGATGCT GCGGAACAAG CTTGTCCGCC TCATGACA'CA 1200
CCTGGACACA GATGTGAAGA GGGTGGCTGC CGAGTTCTTG TTTGTCCTGT GCTCTGAGAG 1260
TGTGCCCCGA TTCATCAAGT ACACAGGCTA TGGGAATGCT GCTGGCCTTC TGGCTGCCAG 1320
GGGCCTCATG GCAGGAGGCG GCCCGAGGGC AGTACTCAGA GGATGAGGAO ACAGACACAG 1380
ATGAGTACAA GGAAGCCAAA GCCAGCATAA ACCCTGTGAC CGGGAGGGTG GAGGAGAAGT 1440
CGCCTAACCC TATGGAGGGC ATGACAGAGG AGCAGAAGGA GCACGAGGCC AT3AAGCTGG 1500
TGACCATGTT TGACAAGCTC TCCAGGAACA GAGTCATCCA GCCAATGGGG ATGAGTCCCC 1560
GGGGTCATCT TACGTCCCTG CAGGATGCCA TGTGCGAGAC TATGGAGCAG CAGCTCTCCT 1620
CGGACCCTGA CTCGGACCCT GACTGAGGAT GGCAGCTCTT CTCCTCCCCC ATCAGGACTG 1680
GTGCTGCTTC CAGAGACTTC CTTGGG3TTG CAACCTGGGG AAGCCACATC CCACTGGATC 1740
CACACCCGCC CCCACTTCTC CATCTTAGAA ACCCCTTCTC TTGACTCCCG TTCTGTTCAT 1800
GATTTGCCTC TGGTCCAGTT TCTCATCTCT GGACTGCAAC GGTCTTCTTG TGCTAGAACT 1860
CAGGCTCAGC CTCGAATTCC ACAGACGAAG TACTTTCTTT TGTCTGCGCC AAGAGGAATG 1920
TGTTCAGAAG CTGCTGCCTG AGGGCAGGGC CTACCTGGGC ACACAGAAGA GCATATGGGA 1980
GGGCAGGGGT TTGGGTGTGG GTGCACACAA AGCAAGCACC ATCTGGGATT GGCACACTGG 2040
CAGAGCMANT GTKTTGGGGT ATGTGCTGCA CTTCCCAGGG AGAAAACCTG TCAGAACTTT 2100
CCATACGAGT ATATCAGAAC ACACCCTTCC AAGGTATGTA TGCTCTGTTG TTCCTGTCCT 2160
GTCTTCACTG AGCGCAGGGC TGGAGGCCTC TTAGACATTC TCCTTGGTCC TCGTTCAGCT 2220
GCCCACTGTA GTATCCACAG TGCCCGAGTT CTCGCTGGTT TTGGCAATTA AACCTCCTTC 2280
CTACTGGTTT AGACTACACT TACAACAAGG AAAATGCCCC TCGTGTGACC ATAGATTGAG 2340
ATTTATACCA CATACCACAC ATAGCCACAG AAACATCATC TTGAAATAAA GAAGAGTTTT 2400
GGACAAAAAA AAAAAAAAAA AAAAAAAAAA AA 2432



(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1742 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:

GTCCTGCAGG AGCTGCACGC GGCCGAGGTG CGCANGAACA AGGAGCAGCG AGAAGAGATG 60
TCGGGCTAAG GGCCCGGSAC GRGSGGCGCC CATCCTGCGA CGGAACACGT TCG<X?TTTTG 120
GTTTTGTTTC GTTCACCTCT GTCTAGATGC AACTTTTGTT CCTCCTCCCC CACCCCAGCC 180
CCCAGCTTCA TGCTTCTCTT CCGCACTCAG CCGCCCTGCC CTGTCCTCGT GGTGAGTCGC 240
TGACCACGGC TTCCCCTGCA GGAGCCGCCG GGCGTGRAGA CGCGGTCCCT CGGTGCAGAC 300
ACCAGGCCGG GCGCGGCTGG GTCCCCCGGG GGCCCTGTGA GAGAGGTGGY GGTGACCGTG 360
GTAAACCCAG GGC3GT3GCG TGGGATCRCG GGTCCTTACG CTGGGCTGTC TGGTCAGCA: 420
GTGCAGGTCA GGGCAGGTCC TCTGAGCCGG CGCCCCTGGC CAGCAGGCGA GGCTACAGTA 480
CCTGCTGTCT TTCCAGGGGG AAGGGGCTCC CCATGAGGRA GGGGCGACGG GGGAGGGGGG 540
TGATGGTGCC TGGGAAGCCT GCKTGTGCAN 3CGGTGCTTG TTGAACTGGC AGGCGGGTGG 600
GTGGGGGCTG CAGCTTTCCT TAATGTGGTT GCACAGGGGT CCTCTRAGAC CACCTGGCGT 660
GAGGTGGACA CCCTGGGCCT TCCTGGAAGC CTGCAGTTGG GGGCCTGCCG TGAGTCTGCT 720
GGGGAGTGGG CATTCTCTGC CAGGGACCCA TGAGCAGGCT GCATGGTCTA GAGGTTGTGG 780
GCAGCATGGA CAGTCCCCCA CTCAGAAGTG CAAGAGTTCC AAAGAGCCTC TGGCCCAGGC 840
CCCTCCGTGG GACAGCCCCG CCGCCCCTCC CCACCAGGGC TTTGCAGATG TCCTTGAAAG 900
ACCCACCCTA GAGCCCTTTG GAGTGCTOG2 CCCTCCTGTG CCCTCTGCCC TGGTGGAAGC 960
GGCASCACAA GTCCTCCTCA GGGAGCCCCA AGGGGGATTT TKTGGGACCG CTGCCCACAG 1020
ATCCAGGTGT TGGAAGGGCA GCGGGTAAGG TTCCCAAGCC AGCCCCAACA CCCTTCCCAC 1080
TTGGCACCCA GAGGGGGCTG TGGGTGGAGG CCTGACTCCA GGCCTCTCCT GCCCACACCC 1140
TCTGGGCTGA GTTCCTTCTT TCCCTTGGAC GCCCAGTGCT GGCCTTGGAG GACGGTCAGC 1200
TGGAGGATGG CGGTGGGGGA GGCTGTCTTT GTACCACTGC AGCATCCCCC ACTTCTCCAC 1260
GGAAGCCCCA TCCCAAAGCT GCTGCCTGGC CCCTTGCTGT AAAGTGTGAA GGGGGCGGCT 1320
GAGTTCTCTT AGGACCCAGA GCCAGGGCCC TCAACTTCCA TCCTGCGGGA GGCCTTGGCC 1380
GGGCACTGCC AGTGTCTTCC AGAGCCACAC CCAGGGACCA CGGGAGGATC CTGACCCCTG 1440
CAGGGCTCAG GGGTCAGCAG GGACCCACTG CCCCATCTCC CTCTCCCCAC CAAGACAGCC 1500
CCAGAAGGAG CAGCCAGCTG GGATGGGAAC CCAAGGCTGT CCACATCTGG CTTTTGTGGG 1560
ACTCAGAAAG GGAAGCAGAA CTGAGGGCTG GGATATTCCT CATGGTGGCA GCGCTCATAG 1620
CGAAAGCCTA CTGTAATATG CACCCATCTC ATCCACGTAG TAAAGTGAAC TTAAAAATTC 1680
AATCAAATGA ACAATTAAAT AAACACCTGT GTGTTTAAGA AAAAAAAAAA AAAAAAACTG 1740
CG 1742

(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1487 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:

GGCACGAGCC TCCGCGAACT GTGGAGTCGG CGGAGGGCTG GAATCAGCGT GGGCTCCAGG 60 

TCGCTGGCAG CCGGGTGGCA GAACTCTTCC GAGGCTCCTT GGGAAGAAGC TACACCCGAG 120
GGAGCCGGAT GGOCCTCGAA AACCTGGCCC GCTCTGGTTC TCTACCA7TG CAAGGGGAAC 180
CGTAAACTGA GCTTTTCTAA CGTGGGTTTC TGCCAAGTAC TTTTCCAGCT GCCCCCTTC: 240
CCCCAGCACA CAGGAGAGCC TCTGTGTAGC CAGCGCTTGA CAGTCGTTAG GTAGGTTGTA 300
CTGTGTAGGG AGGAGCTCAA GATCATGAAT GGTTGTCACA GGAGAAAGCG GTTGCATCTT 360
TGCAAAACTA TATACCTGCT GTGGTTTGTG TTTTCTTTTC TGCTGAGTAA TGAAGTTGTA 420
AGTTCACACT GGCACATTCT CAGGGCTGTG CAGATTATTT GCACTTTATT TCATAGGTGR 480
ATAAGTGCTT TTTAGCTTTC TTTGTATATT GAGTTGCTTT TGAATTGCTT CCCATATTTT 540
TATTTCATAC AAACTGAACA ATTGTGGCCC CTCTATTTTA TTTATAAAGG TTCAGTGTAT 600
CTTTGCCTGC CTACATCAAT CTGCAAGGGA GTTGCAGAAA GCCTCATGTT CATCGAGCCG 660
TGAGTCACAA CCAATTTCTA AGCTGTTATA ACAAAAAAGT GTTTGCTTTT TTTCACAAGT 720
AACTTTAAAA GTGTAGTTTA GAAAGAAAAC ATTTTCAATA AAAAGACACT ACATTAATCC 780
TGGATGCTTG CAAATCCTAA AATMTATTCC TCCTCTAGCG TTGCACAGCT CTGTGTTGTA 840
TACACAGACT AGCTTTAAAA TTTGTCACAT ACCACTTTAC CTTTACTTTT ATGTATCATT 900
CCCCCGACTT CCTTACTGCA GGTGTGGGCA AGAAAACTTT TCCTTTAACA CTTTTCAACA 960
GCGGGCATAA AATTCTGCAG CTGAGGTCTT GAAGAATGCA GATGGGTACA GTATGTGTTG 1020
GAGCTCACAG TGTGTATTGA CTAACCTAGT TCCTTTTTTG CTTTTTTTGG TATTGTCTTG 1080
TTAAAAGTGA CTCCCAGGTA GCAACTCTCT TTTTTAAGGG TGGGAACGAA AGGGACGTAG 1140
GAAGAATAGA TCTAGATTAT TTAACAGTCT TCGATAGAGT TTGAAAGCTT TCTTCTTCAT 1200
TCAATTTTGG GCAAAATACT GCCTCTGCAT TTGTTCATAA CAAAAAGATT AGATTAATAA 1260
GTAGCTTTTG TTGGTGGAAA TTACCAGCTC TATAAGTCAC CGTTGGTGGT TCATGGACCT 1320
CTGATTAGCT TGGGTTTTGC AGTCTCATTG CCACATGTAT ATGTGGAGCC AATGGCCTTT 1380
TGGTGCTCAG CTGTTTACGT CTGACTCCTT GACTTCTTTG GTACAGTGAT GGAGTCAGAT 1440
CTCATTAAGT GTGATTCTCC ATGGATATAA CCAGCCCCAA AAAAANG 1487

(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1328 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

GGOACGAGOT OGTGCCGAAT TCGGCACGAG AGAAGATTTG AAGAAGCCAG ATCCAGCTTC 60
OCTGCGGGCT GCTTCTTGTG GGGAAGGGAA AAAGAGGAAG GCCTGTAAGA ACTGCACCTG 120
TGGCCTTGCC GAAGAACTGG AAAAAGAGAA GTOAAGGGAA CAGATGAGCT CCCAA.CCCAA 180
GTCAGCTTGT GGAAACTGCT ACCTGGGCGA TGCCTTCCG: TGTGCCAGCT GCCCCTACCT 240
TGGGATGCCA GCCTTCAAAC CTGGGGAAAA GGTGCTTCT 3 AGTGATAGCA ATCTTCATGA 300
TGZCTAGGAG GTTCCTGACA TGGGACCOAT CTGCTCCTCC AGCCAACTCC TGTCOCTCAC 360
ATOCCACCAT GGTGGCTCCT CCCACCTCCT CTCGATTTGT TCACTCT3AG ATCTGTTTGC 420
AGAGTGGGTG CTTAGCAGAC AGAGTGAAGC TGGCTGGGGG GCACAGTGGT GTGTAGTGCT 480
GCTGTGTATC AAAAGACCAA GGTATTATGG GACCTGGTTT CAGAATGGGA TGGGTTTCTT 540
CACCTCATGT TAAGAGAAGG GAGTGTGTCC TGAAGAAGCC CTTCTTCTGA TGTTAAAATG 600
CTGACCAGAA CGCTCTTGAG CCCAGGCATC GTTGAGCATT AACACTCTGT GACAGAGCTG 660
CAGACCCCTG CCTTGAGTCT CATCTCAGCA ATGCTGCCAC CCTCTTGTCT TTCAGAGTTG 720
TTAGTTTACT CCATTCTTTG TGACACGAGT CAAGTGGCTC ACAACCTCCT CAGGGCACCA 780
GAGGACTCAC TCACTGGTTG CTGTGATGAT ATCCAGTGTC CCTCTGCCCC CTTCCATCCO 840
CAACCACATT TGACTGTAGC ATTGCATCTG TGTCCTGTTG TCATTTATGT TAACCTTCAG 900
GTATTAAACT TGCTGCATAT CTTGACATAT CTTGAGATTC TGCATGTCTT GTAAAGAGAG 960
GGGATGTGCA TTTGTGTGTG ATGTTGGATA GTCATCCACG CTCAGTTTGG ACCATTGGAG 1020
GAACTTAGTG TCACGCACAA ATGGGGCTAT TCCTACGCTT AGAATAGGGC TTGTCTGCCC 1080
ACTTTAGAAG AGTCCCAGGT TGGTGAGCAT TTAGAGGGAA GCAGGGCAGA ACTCTGAACG 1140
ACAATACGTC TCTCTGAGCA GAGACCCCTT TGTTCTTGTT ATCCACCCAT ATGGACTTGG 1200
AATCAATCTT GCCAAATATT TGGAGAGATT GTGTGGATTT AAGAGACCTG GATTTTTATA 1260
TTTTACCAGT AAATAAAAGT TTTCATTGAT ATCTGTCCTT GAAAAAAAAA AAAAAAAAAA 1320
AAACTCGA 1328



(2) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1856 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:

GAATTCGGCA CGAGC7CTGC AAjG AT*? GCAA ATXA^CTT?:-: -Z-Z Z G-.GGGT* TCCGCTGCZC 60

CC77AGATTAA. A77CCCCGGG CTCAAA. 77GA. G77C-CA-GA77 7 A.—---. TAT C_A TATTTTAAAT 12 3 

TGCTGTCTTC AATT AA-AC CA TTTTA-TGA. ~ HA 7AAC7AJA777 77-AG7_ATC7C GATGCATC-CT 180 

TTTCCAGGCC TTC777TCTT7 GTAJ2AJAA-. 77" AAA7GT7:CA7 .AAAC^GTTTC ACTTATATTC 240 

TTC AAACATG ATGCTAATTT AAA.TT AA.Z7T A. CTTCCTA/TGA 7A.TGTTATTA TTCCT ATGAT 3CC 

15 TTTGCCACTC TTATTAGTTTC TCTCAAAAA.T AC-.7C77AGC-_ AA_~GGATTA TTTTAAGTPA 3 60 

TTTGATTATC TTTCTATCTC TTTTATTT AT TTCTCA-rTTA CTTAAGAAAT TCGTTCCA7T 420 

GGTTGGCATT GATACAGTAA ATT7GT AAA. 7 GAGGA-GACAA 7A7AAAAAA.7 CTAAATTACT 480 

TGTGCTTAAT GAC7GTAGCA GAA.TSC 77777 TC7C7AAA73 AG.A77GTC7T TCTTGCAG7T 540 

TAGTTTGATA GA77TGCAAG CTATGCTC-77" TC CLA7G1AAG7 7AC-77GCGC7 GGTAGGAACG 600 

25 CftGGCTTCTT TGTCTCTGGT TGTAGCTTOG ATGA7CGCCZ GA.T7AGGCAG ACAACGTAGC 660 

CGGAGATCAC AAA7CAGGCC CT77GG7G7A-G TTGC7A*G7'G7 3C7--AGGTOC AGAGAGG77G 720 

GCAGAAACTG ACC7CA.CTGG GCAAGC-GTC-G CCA7C-GACC7 GAT7CTTTAA TGCACTCTAT 78 0 

30 

GTGTTCAGGA AGCGACAGGC CA7AT77G.-.7 TC 7GAGAAAG AA-AACAAGA3 GAAAAACCCC 84 0 

ACAAAGTATA ACAACCCCTT A-AGA.7 A 7A7C TA777^AA_A7 7GAA_-.7TAA_7 TTTTCAG77T 900 

35 ATA.C CATTGG CCAA7CACAA GA7AAAA.A7C- TTCAA.7T7C7 TCAAGAATC C TTTGTTGACT 960 

TGTCTTTTCA TCTCT7GCTA TTTATATT^G TCAC7G7TA7 7CA.ACAAAGT CTTATTTGCT 1020 

GAGGAAGGAC TTTGCTGCAC TTACTG7AC7 AC-.7CA_-.ACA. 7777-7-7GAGGG TGGTGTTTAA 1080 

40 

CTTTTTAAAA AATGTTATTC TGATTA7AAJ7 AA.7AA.7 > A7*?G 3CT77TTTCA TGAAAAGAGC 1140 

GCCACCTTGC AAGGT7TAG7 GAGAT77A7G GA_-.G777G.AA7 A7 77 AAGCAG GAA7TGC7GC 1200 

45 TAGCTCCAAA AA77TGCGAA GCAAAJAGC7.A GCCCCAAT77- G77TGGAAGT TTGAAACTGA 1260 

TTAACAGATT TGCATTTGAA. GTGAC77C.AG ACA7TAGG77 CA^AGATTAG 7TAAAAATAG 1320 

AAAGAGGAAT AAAGACATC7 YTTCTCTCTA GAAAA3AT.AA GACGr.CAATT AATAATCCTT 1380 

CCCACTTTCA TTGAGATCAG CTTGTC7GAT AA.7C7GA7A7 GAG7G7GA7A ATGATAAAGA 1440 

TGATAATAG7 GG7ACTTTTG TA-A7T77GC7 GG7GCAT7TA AGAAGATAGT AAAKGA7GAG 1500 

55 TTCAYCTTTT CTYCGAACA7 YCCTA7VCCT AGA.7G77AG77 7JACC7CAA-AT TGGGAATTAT 1560 

AACTGTCCTA ATTTTTGTTG 7X77 AC C777GA 7G7CGCT777 7-7777 AA7AC CCACAGTG7A 162C 

ACAATTAAAT A7CACACTA7 GACATA.7GA.7 77.-Ai77.AC-G-. 7A7777AAAG ATAAA7T7TA 1680 

GGGGTAAATG TTTACTTCAA AATGA2TCCA TATTTCAAAT ATCTGTTTAG -VCTGTGAAGG 1740
CCAAATAATT TTTAAGAAAA CATTTGAAGA GTAGTGTGTT TGCATTTGTG rtATAATCTTA 1800
CTCACAGCAA GTAAACGTAA TAAAAGCCAA CATTTAAGCC AAAAAAAAAA AAi^AAA 1856


(2) INFORMATION FOR SEQ ID NO: 53: 






(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1558 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:






TGGGTATCCA TTCCTGKAAT TACTTTACTT AGGATAATGG CCTCCAGCTC CGTCCAAGTT 60
GCTGCAAAAG GTATTATTTC GTTCCTTTTT GTGGCTGAGT AGTATTCCAT GGTGTATATA 120
TACCACATTT TCTTTATCCA CTCATTGCTT GATGGGCAGT TAGGTTGGTT CCACATCTTT 180
GCAATTGTGA GTTGTGCTGC TCCAGATATC ATCTTTAACT CCTTTGCCTT CTCCACATAC 240
ATTTCCAAGT CCTGTTCATT CTACCTCCAA AATGTATCTT GTATCCATTC ATCTCTCTCC 300
ATCTTCAATC TATTTCAATG CCCCATCATC TCTTGCATGG AGGAGTGTAA TAATTGGCTA 360
ACTGGCCTGT TCTTACATTT TAAAATCAAA AGATGTGACA GGTGAAATGC CTATTTCAGT 420
GTCCATTGAT GGTTCTGCTT ACACACCACC TGGCTGCCTG GTGTCGCAGT GGCAGAGTTG 480
AGCAGTGTGA AAAAGACTGC TTGGCCCTTT ACAGGGAAAG CAGGTCCACT GTGGCCTGTG 540
AGGACGAGAG CTCTGGGCAG GCTCGGACAC TGGCAGACCC TGGTCCTGGC TGGCCAAGGC 600
AGCAGGGTAT GTGTTTCGGG TCACTCACAG GGCTCAGCAC CACTCCTCAT GGCTTCCTTA 660
CTGTTTCGGC AGAGGCTGAC CCGCGGCTGA TTGAGTCCCT CTCCCAGATG CTGTCCATGG 720
GCTTCTCTGA TGAAGGCGGC TGGCTCACCA GGCTCCTGCA GACCAAGAAC TATGACATCG 780
GAGCGGCTCT GGACACCATC CAGTATTCAA AGCATCCCCC GCCGTTGTGA CCACTTTTGC 840
CCACCTCTTC TGCGTGCCCC TCTTCTGTCT CATAGTTGTG TTAAGCTTGC GTAGAATTGC 900
AGGTCTCTGT ACGGGCCAGT TTCTCTGCCT TCTTCCAGGA TCAGGGGTTA GGGTGCAAGA 960
AGCCATTTAG GGCAGCAAAA CAAGTGACAT GAAGGGAGGG TCCCTGTGTG TGTGTCTGCT 1020
GATGTTTCCT GGGTGCCCTG GCTCCTTGCA GCAGGGCTGG GCCTGCGAGA CCCAAGGCTC 1080
ACTGCAGCGC GCTCCTGACC CCTCCCTGCA GGGGCTACGT TAGCAGCCCA GCACATAGCT 1140
TGCCTAATGG CTTTCACTTT CTCTTTTGTT TTAAATGACT CATAGGTCCC TGACATTTAG 1200
TTGATTATTT TCTGCTACAG ACCTGGTACA CTCTGATTTT AGATAAAGTA AGCCTAGGTG 1260
TTGTCAGCAG GCAGGCTGGG GAGGC2AGTG TTGTGGGCTT CCTGCTC-C-GA CTGAGAAGGC 1320
TCACGAAGGG CATCCGCAAT GTTGGTTTCA CTGAGAGCT3 CCTCCTGGTC TCTTGACCAC 1380
TGTAGTTCTC TCATTTCCAA ACCATCAGCT GCTTTTAAAA TAAGATCTCT 7TGTAGCCAT 1440
CCTGTTAAAT V rGTAAACAA TCTAATTAAA TGGCATCAGG ACTTTAAC2A AAAAAAAAAA 1500
AAAAAAAAAA AAANAAAAAA AAAAGGGGGC CGCTCTAGA3 GTCCAAGTTA NGAC 3N-3G 1559



(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 948 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 

TAAAAATCAT GCTCTGTAC2 ATCCTCACCG TAGTCATCAT CATC3CCGCG CAGACCACGA 60
GAACTACTGG GATCCCTAAA AACGCCCCTG GTCCGGCCCC ACTCTGCGCC CCTCGATCTC 120
CCAGGCTCTT TCTGCAGWCA TACCGCGGAC CCAATGGGCG CCCTGCACAT CCGTTTCTGG 180
GGCCGTCAGA CTTGGATACA TCGTAAACTC CGCCTCCACG GAACGTCTCG CCTK'SCGAGC 240
AAGMTCGGAA TCCAGTTCCT CAGGAACCCC TCCAAAACCC ACACCCCCAG GGACGCCGCT 300
TTCCGGGATC CCGGSCAAAC GCCGGACCCT CAGTCGCTCC AGGCCCCCTG ACCCTCAAAG 360
TGTAGCGCCC CCAACCGAGC AACCTCGGTT TGGTCCCTAA AACCCCGCGT CCTCTATAAG 420
CACCGCCCCA GCTCTGACAA AACCCCGCGT CCAGGTCGGC AGGCTCCGCT TCTTTTCTTC 480
TCCGCGGGGT GATTCAGTCC AGTGATTGGG TTTGTGGCTC CAGGCCTCGC CCACAGACGG 540
ACAGACCCCT CCCTTTCTTC CGGCAAAAGG ACCGAGCCCT GGGGTAGTAA GGSCGCCACA 600
CTCCTGTTTT TTGCAAGTAC ATTTTTGTCC YTCCTCCACC CAGGTATCTG CCTATTTTCT 660
TGCTAATCCC AGAACCTTTC CTTTTGCTTT TTTTAAGGAC ATTTGGGAAG TTCCTGGTOT 720
AGGACCCTTC TCCCTGGGAT AAGAAACCTG CCTGTAAACG CTCTGTAAAT ACTCCCTTCC 780
ACCCATCCCA GCCCCTGGGC AGCCGGGCAG AAGGGAATCC AGGCTATGGA CCTCCCAAGT 840
CCCCGCTCCC CGCTCCCCTC GGCGGCCCCG CCTTGTTCTG ATCTGTGTGT GAGTGTGTGT 900
GAACTTCTGA AAGACAATAT TAAAGAGACT TAGTTGAAAA AAAAAAAA 948



(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 990 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

GGGGAACTGC AGTGACAGCA GGAGTAAGAG TGGGAGGCAG GACAGAGCTG GGACACAGGT 60
ATGGAGAGGG GGTTCAGCGA GCCTAGAGAG GGCAGACTAT CAGGGTGCCG GCGGTGAGAA 120
TCCAGGGAGA GGAGCGGAAA CAGAAGAGGG GCAGAAGACC GGGGCACTTG TGGGTTGCAG 180
AGCCCCTCAG CCATGTTGGG AGCCAAGCCA CACTGGCTAC CAGGTCCCCT ACACAGTCCC 240
GGGCTGCCCT TGGTTCTGGT GCTTCTGGCC CTGGGGGCCG GGTGGGCCCA GGAGGGGTCA 300
GAGCCCGTCC TGCTGGAGGG GGAGTGCCTG GTGGTCTGTG AC<:CTGGCCG AGCTGCTGCA 360
GGGGGGCCCG GGGGAGCAGC CCTGGGAGAG GCACCCCCTG GGCGAGTGGC ATTTGYTGCG 420
GTCCGAAGCC ACCACCATGA GCCAGCAGGG GAAACCGGCA ATGGCACCAG TGGGGCCATC 480
TACTTCGACC AGGTCCTGGT GAACGAGGGC GGTGGCTTTG ACCGGGCCTC TGGCTCCTTC 540
GTAGCCCCTG TCCGGGGTGT CTACAGCTTC CGGTTCCATG TGGTGAAGGT GTACAACCGC 600
CAAACTGTCC AGGTGAGCCT GATGCTGAAC ACGTGGCCTG TCATCTCAGC CTTTGCCAAT 660
GATCCTGACG TGACCCGGGA GGCAGCCACC AGCTCTGTGC TACTGCCCTT GGACCCTGGG 720
GACCGAGTGT CTCTGCGCCT GCGTCGGGGG NAATCTACTG GGTGGTTGGA AATACTCAAG 780
TTTCTCTGGC TTCCTCATCT TCCCTCTCTG AAGGACCCAA GTCTTTCAAG CACAAGAATC 840
CAGCCCCTGA CAACTTTCTT CTGCCCTCTC TTGCCCCANA AACAGCANAA GCAGGANANA 900
NACTCCCTCT GGCTCCTATC CCACCTCTTT GCATGGGAAC CTGTGCCAAA CACCCAAGTT 960
TAAGAAAAAA ATAAAACTGT GGCATCTCCA 990



(2) INFORMATION FOR SEQ ID NO: 56: 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1603 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:

GGTCGACCCA CGCGTCCGGC CCGCCGGCTC CGGAGCGGCT CTGCCTTCCC GAGCGCGGGA 60 
CCGCGCCCTG GGGGAGGAGG GCGAACGACG CGGCGATGGC TCCGCGGGCA CTCCCGGGGT 120 

CCGCCGTCCT AGCCGCTGCT GTCTTCGTGG GAGGCGCCGT GAGTTCGCOG CTGGTGGCTC 180
CGGACAATGG GAGCAGCCGC ACATTGCACT CCAGAACAGA GACGACCCCG TCGCCCAGCA 240
ACGATACTGG GAATGGACAC CCAGAATATA TTGCATACGC GCTTGTCCCT GTGTTCTTTA 300
TCATGGGTCT CTTTGGCGTC CTCATTTNGC CAMCTNGCTT MAAGAAGAAA GGCTATCGTT 360
GTACAAGAGA AGCAGAGCAA GATATCGAAG AAGAAAAAGG TTGAAAAGOT AGRATTGAAT 420
GACAGTGTGA ATGAAAACAG TGACACTGTT GGGCAAATCG TCCACTACAT CATGAAAAAT 480
GAAGCGAATG CTGATGTYTT AAAGGCGATG GTAGCAGATA AGAGCCTGTA TGATCCTGAA 540
AGCCCCGTGA CCCCCAGCAC ACCAGGGAGC CCGCCAGT3A GTCCTGGGCT TTGTCACCAG 600
GGGGGACGCC AGGGAAGCAC GTCTGTGGCC ATCATCTGCA TACGGTGGGC GGTGTWGTCG 660
AGAGGGATGT GTGTCATCGG TGTAGGCACA AGCGGTGGGA CTTTATAAAG CCCACTAACA 720
AGTCCAGAGA GAGCAGACCA CGGCGCCAAG GCGAGGTCAC GGTCCTTTCT GTTGGCAGAT 780
TTAGAGTNAC AAAAGTGGAG CACAAGTCAA ACCAGAAGGA AGGGAGAAGC CTGATGTCTG 840
TTAGTGGGGC TGAAACCGTC AATGGGGAGG TGCCGGCAAC ACCTGTGAAG AGAGAACGCA 900
GTGGCACAGA GTAGCAGGTG AGCCGTGGTT TTGGTGACAT TGGGGGCAGA GTGGTGCAGG 960
GTGAGGAGAA GGTACTTGGA GCCTCCCAGG TGCTGTGGCA GCATAGGAAT GGTATTTGAC 1020
AGGGAAGTGG GAGAGCTTTC CTTGACCCAG GAAGACTGAG GGGGACTGAA CATGATTACT 1080
TGTCTGCCTA GAGCTTCTTG TAAAGAAGTC ACAAACTTAG TGCCTCCAGG GGCTTGGCTG 1140
TGTGATAATG AGGATAGAGG ATTACTTGTG AGGCAATGTG GCATGGTGGG GATTGTGGCA 1200
AACTAGAATT CACATCACCC ACCATATAGG GCTTGCATTA CCACGAGGCA GAAAGCACCT 1260
AGTGTTGCTG CATCTTCTTA CGCAAAAAAG ACAAAATCCA GACTTCTAAA ATGTAAAATC 1320
ACTGATTTTC GATATTGGCA GCTTACTTTT TTTTTTTAAA CAACCATGCA GGCCAAATGA 1380
CTTGTAATCT TGTCACCATT TTTAGGTAAA CTGTGACTTG AAAAAGTCTG GAGCAAACAA 1440
ACCAATGCTT TTTCCTTTTA TTCTGTTGGR AACCAGTTTT CTTTGTGTCA CAGTTYTGAA 1500
ACCTCAATAC GAATATTTCT CTTCCCACCA AATATTTTGA GGCAATTGAA AAGCCACAGT 1560
GATTTATTTC TTGATTTGGC AATTTTAATT TTGCAAGACA ATT 1603



(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1052 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:
TACAGCTCA3 GATGCCTGTA ACATTGTCAT CTCTGGGCTT CTGGGTCCTG CTTAGCCTGC
TTTT7CCCTG GAGGACTGAC CAGGGATGCG GCCCAGCAAC ATGTTA'rTAA ATCATACT3T 
CCTCCCTACC TTTCCCAGAC CTCTCACTCC TGCCTGGTGT TCCAACCCGT TCTGTGGCCA 
GAGTATACAT TTTGGAACCT CTTCGAGGCC ATCCTGCAGT TCCAGATGAA CCATAGCGTG 
CTTCA3CAGN AAGGCCCGAG ACATGTATGC AGAGGAGCGG AAGAGGCAGC AGCTGGAGAG 
GGACCAGGCT ACAGTGACAG AGCAGCTGCT GCGAGA3GGG CTCCAA3CCA GT3GGGACGC
CCAGCTCCGA AGGACACGCT TGCACAAACT CTCGGCCAGA CGGGAAGAGC GA3TCCAAGG 
CTTCCTGCAG GCCTTGGAAC TCAAGCGAGC TGACTGGCTG GCCCGTCTGG GCACTGCATC 
AGCCTGAATG AGGCTGGCCA CCTGCCACTT TGCCCTGCCC TCTGCCTCCA 3GGCTCCMCT 
MYCCTTCCTT TTCTTGGTGA AAGOIAICTC CTTTCCTGAT AATGAATGGT GTTCCCTTTG 
CTTGGCTGGG GAGCCCCCCA G3CCAG3TTT GCTGGCCATA GATACCTTTG -jGCTGCCTGR 
GACAGGCTCC TGAGGAGGAT TGAGGGTGAA AGTCTCCCAC GAGTACACTA AACCTAGGTI 
TGGTCACCAA TAGGGTTTGG AGAGCAAAGG GCCACAACTC ATCAGCTGCC TGTITCTTA3 
ATGCACTTTC TTTTTCCACC AGCACATCCT TCAACACACA GAATTTCAGG GAA3AGTTCT 
CCCCAAAACC CTAGCTCTTT ACCCTTCCAT TTTAGCCTTC CACCCAGCTT CCACAAAAGA 
TTTGGCTCTA CCTTGGATCT GCTAGTAAAT AACTAATAGG CAGGCAGTTA TTT'GGGTAAG 
GAAAAAAGGG GTGGGAGAGA CAGAAAATTT GCCCACTGCT GCTCCTCCCC TTGGSTYTCC 
ACCTGGGATT TGCTATTGAA TCTCTACCCT NN 

(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 814 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 
ACNCGNTGGC GGCCGCTCTA GAACTAGGGG ANCCCCCGG3 CTGCAGGAAT TCGGCACGAG 
CATAGACTTT TAAACTGGTA CGGTTCTTAG AGATGGTCCT TGGCCTTCTG TTGTTGTTGT 
KGTTTTTTTC TTTTTCTTCT TCTCCTTCTC CTTCTTCTTC TCTTCTCCTT CTTTCTTCTT 



TTTTTTTTCA GAGTCTTGCT CTGTCACCAA GACTGGAGTG AAGTGATGTG ATCTCGGCTT 
ACTGCAACCT GGGAGGCAGA 'JGTTGCAGTG AGTCGAGATG GTGCCATTGC TCTCGTTTGG
GCAACAAGAG TGAAACTCTT GTCTCAAAAA AAAAAAAAAA ATGAGGTTTA AGACAGTTTT



GTCATTACTG GTGGGATCTG GTCACACAAG ATAGCATTAA ACGTGACA7G GCAGATAAAA 
TIXjGTTAAAA AATTTTGTTT TTTAATTACG TAATGTAAAA GCCCAACAAA CACTTTATGC 
AAGATTGGAA TGTATCTTCA AATTCAGATT TAATAAACAT GTAAAGATCC TCTGTATATA 
AAAGTTGTAT TTAATCCCTT GTGCCCCAAG AATGCTATAA AAGATCCCAA GAATGTTATC 
TATGAAAAGA TAGCAATAGG GAATGGTGAA CAAATAATTT AATTTGCCAA TTCTAAAAAA 
CATGGAGTTA AACCCCATGA AAACTTGGTT CCATAGTTTT AACTGTTTTA TGGTTCCAAT 
ACAAAACCAG AGTGGTTTAC ATTCCACAAT MACCAAATTT GCATCCAATN TTGGGGTAAT 
TTTNGGTATT TGCCATGGGA TACTATTCAT TTTT 

(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1215 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:
AGAGGAAGTC TTTTGCCAAG CCTGTTCTCT GGACTAACGC CATCCAGGCT GGGAGGGGAA 
GAGTGCTCTG CTACACTCGT CCCCCTCCTG CCTCATCTTC CTTCTCAGCC TTGGTTCCTG 
ATGGGAACAG AATGGAGGGC CTGAGAACAT ACTTTCTAAA TGCCTTTGAC CCAGGAACCG 
ATTATCTATA TTTGTTCCCA TTTTCCTTCA CCGTGACATT CCAGCATTGT CTGACTGTGA
GGTGGGCCTT TGAGAGCCTC CAGGTTCCTC AAAACAGGCC TGAGCGATG3 GCATCACACC 
CTCTGCCTAC CCACRTGCCT GCTTACCTGC CAGATAACCA AGTGNAGATG TCTGCGAGTG 
GCTAGTTTTC ACATTCTTAC TAGTGTTTGG YTCACCTTTG GGCAAAGGCC CCCTCTAGGC 
CTTGCCCCAC CTCCATCAAA CGCAGACACT GTAGTCAGAC CTCAGYAATA TAGGAGGCAA 
TAATCTTTTA ACAGTGTTTT GCAAACAAAC AAAAAGAGAA AAATCCCAGC CAGGGGAACT 
CGCCACCTGC CCACGCTAGT TCCATCCACG CTCAAGACCC GCCCTTAGAC CAGGCAGGCA
AAGGCCCCCA TCACACTCGG CCACTAGTGG GGTCCTGAGG CCAAGAAAGA AACCAGACCC 
TGTATGACAA GTTGGGKTCT TTCCAGAACA CGACAGAAAC AGGGGGGGCC CCTTGTTAAT 
GCCACTCCAT ACTCCAGAAG CATTATTCCT TATTTGGGAC AGCCAAGGGC AGATTCACAG 
GTTATTGTAG GAATAAAGAC TAGTTTACAA AGGARAAAGA GSCCCTGGAC TTCCCMAGGA 
AAGGTCAGGT TA5GGCTCCT GTACCCATTC TGTTCCACCA CTGTTTGATC TCTCTGGCCT 900
CCCACCAGGA ATGCCGTTTC CTTTTTATGG ATCTGTTGGG AACCAGAGAG AATCAACAGA 960
TCAATGACAT AGGATCCGAA GTGCAATGAT AGTCACTTCT AGTTTGGCAT TTCACAAACT 1020
CTGKACAGCA AGGTATTGGT AGGTTACTCA ATTTCAAAAG GGCCCCATGG CCAAATATGT 1080
TTAGGAACCG CT3TTTGNAT TTCTTTTTTT GGAGACGCAT TGTATATAAT ATATGTCAAA 1140
GGCTTTCGGA ATTCCTGCAG GAAAGAAATC AGCTTTGTTA AATCCNAAAA AAAAAAAAAA 1200
AAAAAAATAG ACTCG 1215



(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 478 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

ATTTCTTATG ACATGGGGGT TTGAATTGGT TGGCAAATGT TTAATTTTAA TATCCATAAT 60
CAGTGAGGTC CTGCTGGCTG TAATCATTAA TTGTGAAATC TAAGGAGCTT AGTTCATGGC 120
TCTAGAATTT CACAGAAAAR TGYGMTATGA TACGAGCATT AAGTTTATTT CTTCTGATCT 180
TTGATGCAGC TTTGTTCAGT TTATCTGTTT TTGTATTTAT TGGTCATCTA CTTCCCATGC 240
CAAAAGGGAC TGGTCTACAT AGCTjGCGCTA AACACCTGAT CAAATCACTA AAAGAAAATG 300
TGTTACCTCT AATGAATTAT CCTGATTGTA AGTTAAAAAT CAATATTTCC CCGTAGTGAG 360
GTTTGCTTTT TAAAAAGAAK KCTTAAAAAA AAAAAAAAAA AAACGAGTTN AAGAAAAGGA 420
AGCAAGCTCA GGTAAGGTGC ACACATTGGG CTAAGGAAGC TAGAGCCTGT GGAGANGC 478

(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 618 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:
TATGACCTTG ATAACCCCAA GTTOGAAATT AACCTTCANT AAAGGGAACA AAAGCTGGAG 60
TTCGCGCGCT TGCAGTTCGA CACTAGTGGA TCCCAAAGAA TTCGGCACGA GTCATAATGA 120
GCTACTAGG7 AAGCCTTCTG GGACTTTCAG ATA7TTTGGG GAAGATTGAT TTTTC-TTCTT
AIATC-CTGTG GACCCTTGGC CA7CAAA7GG TAT3G33AAG "TCATCCGTC TGTCTOTCAT
G-jTCATGTCA GTCAGGCGTC TTTTTAGTAT TTAGTGG3T3 CTCAG7ACTG TGCGAGATGC
T3TCGGGAGC CGTGGTGGTA T3GAGGAGGA GTGCTOIAGA GGACTCTGCT GTGTGG3AG3
CGAGCATAAA CAAGCCAAGG GGAAAAGGCA GGCATGGAAI AAAGGGGGAG AATACCAGTG
TGTGACTTAC TGCTGACTGT GTGGATTAGC CTATCAGCAG TAAT3AAGCA GGG0GGA3O3
CATTATCTTT GAGCCAGAAG AGTGAGCACT GGSGCGAGGG TGGAGCATCA AGAGGGGGTG
TAGGACCNCA AGGCTTCTTN CNGGGGAGAC AACGTGAATA AGCNGTCAGT AGTCACCGA3
AGTTTTGtGGA AGCAAGGG



(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 751 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:
TCGACCCACG CGTCCGAGGA GCTGGACTTC TGAGACAGCC ATTCTCCTTG CATA3CACTG
TCTGCTGCTA CAGCTCATAG AAGTCAACAA TTTTCTTCAA CACTGGTAGG CAGCCTCTAA
ATGGCCCTGA TCACCCTCAC CTCCTGCCAT TCACACCNNT GTAAAATTCC ACCCCTGGAC
CTAGTGACTC ACTTCTAACA ANGAGAATAC AGCAAAAGTA ACATCGCTTC TGAGGTGAGG 
CTACAAGGAG ACTACGATGC CTGCCTTGGT CACCCTTCTC CTGCTCTTTC CATTGCTCCC 
TCTGATGGAA C^CAGTTGCC ATGTGATGAG GTGCCCTATG GAGAGGCCCA CGTGACAAGG 
TATTGTAAAA AGCCTCTGAC CAATAGCCAT CTAGAAACGG AGGCCCAGTC CAGCAGCCTC 
TGAGATGAAT CCTGCCAACC TGAGCTTGGA GACAGATTCT CTCCCTATCC TGCCTTGGGA 
TGATCACAGC CACCACCAAC ACCTTCACTG CCTGGTGAGA GGCCAAGCCA GTGAACCCAA 
GGTAAACTGG ACAGAATCCT GACCCACAGA AACTGAGATA ATGTTTGTTA TTTTAAGCTG 
CTCAGTTTGT TACAGAGCAA TAGATAACTA ACTCAAACAC CATAAAATTC TAATATTTTA 
TTCTATCACA CAAACCAGGT AATACCAAGT AAATGCCATT ACTATACACA TATTTTTGTA 
ACACAATTAC ATGTGATTTT TTAAGAAGGC T 
(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 780 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

c^;c^rcA cd xttccccga TCCCCGGGTC GACCCACGCG TCCGGGTTGG CAACTCCTGA 60
GGCCTG-DATG GGTGACTTCA CATTTTCCTA CCTCTCCTTC TAATCTCTTC " fAGAGCACCT 120
GCTA7CDCCA ACTTCTAGAC CTX3CTCCAAA CTAGTGACTA GGATAGAATT TGATCCCCTA 180
ACTCACTG7C 7GCQGTGCTC A7TGCTGCTA ACAGCATTGC CTGTGCTCTC CTCTCAGGGG 240
CACCATGCTA ACGGGGCGAC GTCCTAATCC AACTGGGAGA AGCCTCAGTG GTGGAATTCC 300
AG3CACTCTG ACTGTCAAGC 7GGCAAGGGC CAGGATTGGG GGAATGGAGC TGGGGCTTAG 360
CTGGGAGGTo GTCTGAAGCA GACAGGGAAT GGGAGAGGAG GATGGGAAGT AGACAGTGGC 420
TGG7A7GGC7 CTGAGGCTCC CTGGGGCCTG CTCAAGCTCC TCCTGCTCCT TGCTGTTTTC 480
TGADGATTTG C-3GGCTTCGG A3TCCCTT7G TCCTCATCTG AGACTGAAAT GTGGGGATCC 540
AGGA7G>GCC7 TCCTGGCTCT 7ACCCTTCCT CCCTCAGCCT GCAACCTCTA TCCTGGAACC 600
TG7CCTCCCT 7TCTCCCCAA CTATGCA7CT GTTGTCTGCT CCTCTGCAAA GGCCAGCCAG 660
CTTGGGAGCA GCAGAGAAAT AAACAGCATT TCTGATGCCA AAAAAAAAAA AAAAAAAACC 720
GCGGCCGAAA GDTTATTNCC CTTTAAGTAA GGGGTTAATT TTTAGCTTGG GCACTNGGCC 780

(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 588 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:
TTGCGAATTA ATCGACTCAC TATAGGAAWT GCCGTCGCCA TGACCCGCGG TAACCAGCGT 60
GAGCTCGCCC GCCAGAAGAA TATGAAAAAG CAGAGCGACT CGGTTAAGGC AAAGCGCCGA 120
GA7GADGGGC TTTGTGCTGC CGCCCGCAAG CAGAGGGACT CGGAGATCAT GCAGCAGAAG 180
CAGA.i.AAAjGD CA A-ACGAGAA GAAGGAGGAA CCCAAGTAGC TTTGTGGCTT CGTGTCCAAC 240
TGCCTGGAGC CAGTCCCACC ACGCTCGCGT TTCCTCCTGT 300
AGTGCTCACA 3GTCCCAGCA CCGATGGCAT TCCCTTTGCC CTGAGTCTOC AGCGGGTCCC
TTTTGTGCTT CCTTCCCCTC A3GTAGCCTC TCTCCCCCTG GGCCAC7CCC GGGGGTGAGG
GGGTTACCCC TTCCCAGTGT TTTTTATTCC 7GTGG3GCTC ACCCCAAAGT ATTAAAAGTA
GCTTTGTAAT TCCAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA
AAAAAAAAAA AAAAAAAAAA AAAANNCGGG GGGGG3CCCC CCCCCCCC

(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 774 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:
TTTAAAGATG AAGAAATGAC AAGGGAGGGA GATGAGATGG AAAGGTGTTT GGAAGAGATA 
AGGGGTCTRA GAAAGAAATT TAGGGCTCTG CATTCTAACC ATAGGCATTC TCGGGACCGT
CCTTATCCCA TTTAATTAAT TTCTCTGACA ATTCAATTAT TTTCTGTTAT TAATGTTGCC 
ACTGCTTTCT GTTTGTCTGC ACTTTCTTGA TAAATATTTG CTATCGTTTT ACTCCAGTCA 
TTCGATGTTG CTGAGATTTA CATATGACTC TTGTCAACAT CTCATCTTTT GACCCAATCT 
TATTCATTTA ATAAGAGGTC TCATTCATTT GCATGGAAAA ATGCTCATTG TATATTGCAA 
AGTGAAAATA ACGAGTTGCA AAACAGTGTA TACATATATG TGTGTATATA TGTACACTTT 
ATTTGTACAT TTCTATGTGA CATAATGCAA AGGAAAGTGT CTGATTTTAT TATACACCAA 
AGGTTAACAG TGAATCTCTG TGTGATCTCT TTTTTTTTCT TTTTGCCTAT CTGCATCTTC 
TCACTTGCCA AAAAATGAAT ATATGTTTAT GTGTGTATAT TACTTGTGTC ACAAAAAACC 
CTAAAGTAGA CAGTAAAAGA ACTTGTCAAT CGCCTTTGGA AGGCAATGAA ACACTTAATA 
AACTCTCAAT AACAGAAGCG TAAAAATGAA ATGTAAACCT CCAATTACCT CTGGATCTCT 
TAGCCAGAGT AATAAACTGG TAATTATTAC AGATAAAAAA AAAAAAAAAA AANA 

(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1866 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:








ACCCACGCGT CCGGTCCTCT TCTTCAGCAC ATGCCAAAGC TGTTCCTCAC GGCCT< jTGAG 60
ACAAGAGCAT CTTGGATGTA Q3ACAATGGA AGAGTTAGAT AGGAA'TTGGA 120
ACGCTCCACC CTTCAGGACA GTGATGAATA TTCCAACCCA GCTCCTCTTC CCCTGGATCA 180
GCATTCCAGA AAGGAGACTA a:cttgatga GACTTCGGAG ATCCTTTCTA TTCAGGATAA 240
CACAAGTCCC TTGCCGGCGC ANTCGTGTAT ACTACCAATA TCCAGGAGCT GAATGTCTAC 300
AGTGAAGCCC AAGAGCCAAA G3AATCACCA CCACCTTCTA AAAIXJTCAGC AGCTGCTCAG 360
TTGGATGAGC TCATGGCTCA CCTGACTGAG ATGCAGGCCA AGGTTGCAGT GA3AGCAGAT 420
GCTGGCAAGA AGCACTTACC AGACAAGCAG GATCACAAGG CCTCCCTGGA CTCAATGCTT 480
GGGGGTCTSG AGCAGGAATT GCAGGACCTT GGCATTGCCA CAGTGCCCAA GGGCCATTGT 540
GCATCCTGCG AGAAACCGAT TGCTGGGAAG GTGATCCATG CTCTAGGGCA ATCATGGCAT 600
CCTGAGCATT TTGTCTGTAC TCATTGCAAA GAAGAGATTG GCTCCAGTGC CTTCTTTGAG 660
CGGAGTGGCT TGGNCTACTG CCCCAACGAC TACCACCAAC TTTTTTCTCC AC3CTGTGCT 720
TACTGCGCTG CTCGCATCCT GGATAAAGTG CTGACAGCAA TGAACCAGAC CTGGCACCCA 780
GAGCACTTCT TCTGCTCTCA CTGCGGAGAG GTGTTTGGTG CAGAAGGCTT TCATGAGAAG 840
GACAAGAAGC CATATTGCCG AAAGGATTTC TTAGCCATGT TCTCACCCAA GTGTGGTGGC 900
TGCAATCGCC CAGTGTTGGA AAACTACCTT TCAGCCATGG ACACTGTCTG GCACCCAGAG 960
TGCTTTGTTT GTGGGGACTG CTTCACCAGT TTTTCTACTG GCTCCTTCTT TGAACTGGAT 1020
GGACGTCCAT TCTGTGAGCT CCATTACCAT CACCGCCGGG GAACGCTCTG CCATGGGTGT 1080
GGGCAGCCCA TCACTGGCGG TTGTATCAGT GCCATGGGGT ACAAGTTCCA TCCTGAGCAC 1140
TTTGTGTGTG CTTTCTGCCT GACACAGTTG TCGAAGGGCA TTTTCAGGGA GCAGAATGAC 1200
AAGACCTATT GTCAACCTTG CTTCAATAAG CTCTTCCCAC TGTAATGCCA ACTGATCCAT 1260
AGCCTCTTCA GATTCCTTAT AAAATTTAAA CCAAGAGAGG AGAGGAAAGG GTAAATTTTC 1320
TGTTACTGAC CTTCTGCTTA ATAGTCTTAT AGAAAAAGGA AAGGTGATGA GCAAATAAAG 1380
GAACTTGTAG ACTTTACATG ACTAGGCTGA TAATCTTATT TTTTAGGCTT CTATACAGTT 1440
AATTCTATAA ATTCTCTTTC TCCCTCTCTT CTCCAATCAA GCACTTGGAG TTAGATCTAG 1500
GTCCTTCTAT CTCGTCCCTC TACAGATGTA TTTTCCACTT GCATAATTCA TGCCAACACT 1560
TTTTCACCTC TAGTGATGGC GGTTTTCTTA 1620
TTTGGTCCTG ATACriGTTT CTTTTCACGT TTTCCCATTT CCCTGTGGCT CACTGTCTTA 1680
CAATCACTGC TGTGGAATGA TGATACCACT TTTAGCTCTT TGCATCTTCC TTCAGTGTAT 1740
TTTTGTTTTT CAAGAGGAAG TAGATTTTAA CTGGACAACT TTGAGTACTG ACATCATTGA 1800
TAAATAAACT GGCTTGTGGT TTCAATAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1860
AAAAAA 1866

(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1152 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

CTCAAGGATG TAAAGGCTCT GCAGATTTCG GGAGGCCTGT CTCCCAGCAC CTGATGGGAC 60
ACTTTTTGCC CCACTGTAAA TTCTGGGTGT ATCCTCCACT GTATGCTGTC ACCCCAAGGG 120
CAAGCACTGC ATCTGCTTAG TGAAGGATTT ATTGTTCGGA AGATACATTT TCCCCTTKAG 180
CAGAGAGTGG CGTATCCTGG CAGTCTTCGG TGAGCCAGTT GTACCAGGAT TATGAAATGC 240
AGATGTTTAC TGTGTCATTG TTGCTGTCAT TGCTACTGAG GAGTACTGAC CAGAATCATC 300
TGCAACTYTT AGTTGGCAGA GAGGACCACT ATGGCGGGTA GCTCTTTTCT TTCCTGCCAT 360
TGTGGGGATG ATTCCAGGCC AAAGATGATG GARAAGTATG GAAATCATCT GAAAGGTTGA 420
AGCTTGGCAC GTGAAGCCAT TCATGACTTT GTAAGGCAGT TTTGCTGAAG GCCAGTTCTG 480
CCCTGGGAGG GACGGAGGTG AATCCTCCTG AGTACCTGTG GTTTTCTTAC TTCCTGCTGA 540
ATTTACCTAA GTGCCTGTTG TTTGCTTGCT GTGGAGGCTT TCTGGTATTT CATTTCAGGT 600
GCAGATGCCT TCACTTTCCC ACCRAAAAAA CCCCMACCAA ACCTAAGACC TTACTGCAAC 660
TAAGTYTNCC AAGTACTTTT TAACCCAATG GGATGAACAG CCTGTGGTCT GCTCAGATCA 720
CCCTGAGTGC GTGTGAGAAG GCMTNGGCTT TGCCAGGAAA TCCAGGAAGG CAGGGCCGGG 780
CTGTGTTGGA AGCTGGCTTA GCTGGTGGGG CAGCCTTATT TCAATTAAAA GGGCATTGAC 840
TGGGAGCAGC AGTCCTGGAG TTTGTTGCAT TTCCTATTGC CCTCAAAATG AGAAACCAGG 900
AAAATAGCAG ATTGGAGCCT TCGAGAAGGC AGTAAATGGC TGTTTTTATT GACAAAAGGA 960
AAACATTTTA CTGCCATCTC ACTGATGGCA TCTCACTGAC TTAAAATGAA GGCANGTTGT 1020
AGTAAAAAAA AAAGTCTACA TTTTTCCACC GCCACGTTCT TATATCCTGT TTGTCAGCCA 1080
CTGCTCANAA GGGCATGTTG TCTTGCGGAN TANAGGCGCT CTCCTTCCCT CGTTTTCCCT 1140
ATAGGTTGGG TG 1152








(2) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2483 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:








AGCAGGCGGT GCGCTGGGGG CGGGAGCAGC GCGKAGCCCG GCTCGGCCAC ACCGATCGCC 60
CGCCGCCATG GGCTCCTCGC AAAGCGTCGA GATCCCGGGC GGGGGCACCG AGGGCTACCA 120
CGTTCTGCGG GTACAAGAAA ATTCCCCAGG ACACAGAGCT GGTTTGGAGC CTTTCTTTGA 180
TTTTATTGTT TCTATTAATG GTTCAAGATT AAATAAAGAC AATGACACTC TTAAGGATCT 240
GCTGAAASCA AACGTTGAAA AGCCTGTAAA GATGCTTATC TATAGCAGCA AAACATTGGA 300
ACTGCGAGAG ACCTCAGTCA CACCAAGTAA CCTGTGGGGC GGCCAGGGCT TATTGGGAGT 360
GAGCATTCGT TTCTGCAGCT TTGATGGGGC AAATGAAAAT GTTTGGCATG TGCTGGAGGT 420
GGAATCAAAT TCTCCTGCAG CACTGGCAGG TCTTAGACCA CACAGTGATT ATATAATTGG 480
AGCAGATACA GTCATGAATG AGTCTGAAGA TCTATTCAGC CTTATCGAAA CACATGAAGC 540
AAAACCATTG AAACTGTATG TGTACAACAC AGACACTGAT AACTGTCGAG AAGTGATTAT 600
TACACCAAAT TCTGCATGGG GTGGAGAAGG CAGCCTAGGA TGTGGCATTG GATATGGTTA 660
TTTGCATCGA ATACCTACAC GCCCATTTGA GGAAGGAAAG AAAATTTCTC TTCCAGGACA 720
AATGGCTGGT ACACCTATTA CACCTCTTAA AGATGGGTTT ACAGAGGTCC AGCTGTCCTC 780
AGTTAATCCC CCGTCTTTGT CACCACCAGG AACTACAGGA ATTGAACAGA GTCTGACTGG 840
ACTTTCTATT AGCTCAACTC CACCAGCTGT CAGTAGTGTT CTCAGTACAG GTGTACCAAC 900
AGTACCGTTA TTGCCACCAC AAGTAAACCA GTCCCTCACT TCTGTGCCAC CAATGAATCC 960
AGCTACTACA TTACCAGGTC TGATGCCTTT ACCAGCAGGA CTGCCCAACC TCCCCAACCT 1020
CAACCTCAAC CTCCCAGCAC CACACATCAT GCCAGGGGTT GGCTTACCAG AACTTGTAAA 1080
CCCAGGTCTG CCACCTCTTC CTTCCATGCC TCCCCGAAAC TTACCTGGCA TTGCACCTCT 1140
CCCCCTGCCA TCCGAGTTCC TCCCGTCATT CCCCTTGGTT CCAGAGAGCT CTTCTGCAGC 1200
AAGCTCAGGA GAGCTGCTGT CTTCCCTCCC GCCCACCAGC AACGCACCCT CTGACCCTGC 1260
CACAACTACT GCAAAGGCAG ACGCTGCCTC CTCACTCACT GTGGATGTGA CGCCCCCCAC 1320
TGCCAAGGCC CCCACCACCG TTGAGGACAG AGTCGGCGAC TCCACCCCAG TCAGCGAGAA 1380
GCCTGTTTCT GCGGCTGTGG ATGCCAATGC TTCTGAGTCA CCTTAACTTT GAACCATTCT 1440
TTGGAATTGG CGTGGTA7AT TTAACCACGG GAGCGTGTCT GGAAACGCAA ACTATCATTA 1500
ATTTCATACT AGTTTGTACC GTATCTGTAG 'XATCCTG7A AATAATTCCA AGGGGAAAAC 1560
TAAACGAGGA CGTGGGTTGT ATCCTGCCAG 3TTGAGTGGG GCTCACACGC TAGGGTGAGA 1620
TGTGAGAAAG CGCTTGTATT TTAAACAAC: AAAAAGAATT GTAAGGGTGG CTTGCTGCCA 1680
GGCTTGCACT GCCGTTCCTG GGGGTGTGCA TCTTCGGGAA AGGTGGTGGC GGGGCGTCCA 1740
CTAGGTTTCC TGTCCCCTGC TGCTCCTTCC GTAAGAAAAT GAAATATTCT ATGCCTAATA 1800
CTCACACGCA ACATTTCTTG TACTTTGTAA GTCGTTTGCG AGAATGCAGA CCACCTCACT 1860
AAACTGTAAA CGGTAAAGAG ATTTTTACTT TTGGTCTCCG TGAGTCGCAT CTCTACTAAG 1920
GTTTACACAG GAATTCCACC TGAAGACTTG TGTTAAAGTT CTACAGCGCG CACTGTTAAC 1980
TGAACGTCTT TTTCTTCAGC CTATACGCGG ATCCTTGTTT TGAGCTCTCA GAATCACTCA 2040
GACAACATTT TGTAACTGCT GCTGTTGCTT TCTACATACA CCTTATAAAG TGACATTTCA 2100
AAAGAAATAA GGTGCCACAG TTTTAAACCA GAAGGTGGCA CTCTGTGGCT CCTTGTAGTA 2160
TTATAGCTAT ACTGGGAAAG CATAGATACA GCAATAAAGT ACAGTAATTT TACTTTTTTT 2220
CTTGTGTTAC ATCTAAATTA CAACCCTTAA TTGCCACGTG TGCACTTACT ACTCTCCAGT 2280
ATGTCTTATT ACTCTCCAGT ATGTCACGCA TCTTTAACTT TTCACGTCCT ATGTTTGCTT 2340
TCTCCCATTT TTAAGAGATG GTAAGTTAAC TGGAATTGAT TTACTGAATG AAATTAAATG 2400
CAGATATCCC TGTTTTTGAA ATAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 2460
AAAAAAAAAA AAAAAAAAAA AAA 2483

(2) INFORMATION FOR SEQ ID NO: 69:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 536 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 

GAGAAATGGA GCTTTGTTAG ATAAAAATTT TTTCAACGCA AACAGTCATT TTCCAGTGAA 60
AGGAGAGCGT ATCCGCCGTA GGATGGACTT AGATCGTGTA AAAGCTGAGG CCACCGAGGA 120
TATAACCTCC GGGGTCCTTT GCCTCCTTTT CCTTAGACTC CCTCCAAACT CGTGTATCTT 180
TCCTTCAGCA GTACTGGGCT CCACGCGAAC CTAGTCCTTT GTCTTTACCC TATTACCTTT 240
CATAACATCC TAGTTGAAAA GTARTTATTC AACCGCGTTT GAAAATGAGA ACAGGTTCAC 300
AGARGCTAGG TTACTTGCGA AGGTCGTTCA ATTAGTAACC AGTAACGCCA GGACTGCCAG 360
TTTCTTGCTT CC3AATTCTC ATGGTAGCTT TCACCARGCT CCCCGTCMAA TGCTAACGTC 420
AACTACTGAA CTAGATTA3C AAAAAGGTCT TTTAACAGAA TTCCTGGTTT TCAGAGAGAG 480
TTTCTTTCA7 GAAGCGCCCC ATTTCTACAG AGGAAAATAA ACTCCAAGCA GCCAG7 536

(2) INFORMATION FOR SEQ ID NO: 70: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 865 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:

CCACGCGTCC OXCTTTCTT GGCCAGAGGC GCCGGTTGGA CTCACGGGCG GGGCATGATG 60 

GGTAACAGGA CCGGTGGGGT CCCCAGGAAG TCCTAGAGGG GGTCGGGGTT TGGGTGGACA 120 

25 AGCTTTCCTC GTCCTCTC CC GACAGAGCTG ACGTGTCCTG GGTTCCACCG GGAGCGGGCA 180 

TTTCCACC GG ACGGGAGGGT TCGGGGTGTC CGC^GGCTGGG GAATACGTAG GGGTTGCCGC 240 

GCGGTGTGGG GAGTTGGGGC GTOTGGCTGC AGTCCCGGGA GTTCTTGGAG GGGGTCGGCC 300 

CACCGAGCTT CCGGACCGGC TGATCTGCCC GTAGCTTGCC GGANGGARGG CGGAGCTGAC 360 

TCTCCGTCCC TTCTCCCATC CCCTCCAGTG GTGGGTACGG GCACCTCGCT GGCGCTCTCC 420 

TCCCTCCTGT CCCTGCTGCT CTTTGCTGGG ATGCAGATGT ACAGCCGTCA GCTGGCCTCC 480 

ACCGAGTGGC TCACCATCCA GGGCGGCCTG CTTGGTTCGG GTCTCTTCGT GTTCTCGCTC 540 

ACTGCCTTCA ATAATCTGGA GAATCTTGTC TTTGGCAAAG GATTCCAAGC AAAGATCTTC 600 

CCTGAGATTC TCCTGTGCCT CCTGTTGGCT CTCTTTGCAT CTGGCCTCAT CCACCGAGTC 660 

TGTGTCACCA CCTGCTTCAT CTTCTCCATG GTTGGTCTGT ACTACATCAA CAAGATCTCC 720 

TCCACCCTGT ACCAGGCAGC AGCTCCAGTC CTCACACCAG CCAAGGTCAC AGGCAAGAGC 780 

AAGAAGAGAA ACTGACCCTG AATGTTCAAT AAAGTTGATT CTTTGTAAAA AAAAAAAAAA 840 

AAAAAAAAAA AAAAAAAAAA AAAAA 865 



(2) INFORMATION FOR SEQ ID NO: 71: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 932 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 
TCATCATATA CAAAGTTTTT CSTCACACTG CAGGGTTGAA ACCAGAAGTT AGTTGCTTTG 
AGAACATAAG GTCTTGTGCA AGAGGAGCCG TCGCTCTTCT GTTCCTTCTC GGCACCACCT 
GGATCTTTGG GGTTCTCCAT GTTGTGCACG CATCAGTGGT TACAGCTTAC CTCTTCACAG 
TCAGCAATGC TTTCCAGGGG ATGTTCATTT TTTTATTCCT GTGTGTTTTA TCTAGAAAGA 
TTCAAGAAGA ATATTACAGA TTGTTCAAAA ATGTCCCCTG TTGTTTTGGA TGTTTAAGGT 
AAACATAGAG AATGGTGGAT AATTACAACT GCACAAAAAT AAAAATTCCA AGCTGTGGAT 
GACCAATGTA TAAAAATGAC TCATCAAATT ATCCAATTAT TAACTACTAG ACAAAAAGTA 
TTTTAAATCA GTTTTTCTGT TTATGCTATA GGAACTGTAG ATAATAAGGT AAAATTATGT 
ATCATATAGA TATACTATGT TTTTCTATGT GAAATAGTTC TGTCAAAAAT AGTATTGCAG 
ATATTTGGAA AGTAATTGGT TTCTCAGGAG TGATATCACT GCACCCAAGG AAAGATTTTC 
TTTCTAACAC GAGAAGTATA TGAATGTCCT GAAGGAAACC ACTGGCTTGA TATTTCTGTG 
ACTCGTGTTG CCTTTGAAAC TAGTCCCCTA CCACCTCGGT AATGAGCTCC ATTACAGAAA 
GTGGAACATA AGAGAATGAA GGGGCAGAAT ATCAAACAGT GAAAAGGGAA TGATAAGATG 
TATTTTGAAT GAACTGTTTT TTCTGTAGAC TAGCTGAGAA ATTGTTGACA TAAAATAAAG 
AATTGAAGAA ACACATTTTA CCATTTAAAA AAAAAAAAAA ACTNGAGGGG GGCCCGGTAC 
CCAAATCGCC GCATAGTGAT CGTAAACAAT CT 

(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 996 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 
CGCCTGGCAC CATGAGGACG CCTGGGCCTC TGCCTGTGCT GCTGCTGCTC CTGGCGGGAG 60 
CCCCCGCCGC GCGGCCCACT CCCCCGACCT GCTACTCCCG CATGCGGGCC CTGAGCCAGG 120 
AGATCACCCG CGACTTCAAC CTCCTGCAGG TCTCGGAGCC CTCGGAGCCA TGTGTGAGAT 180 
ACCTGCCCAG GCTGTACCTG GACATACACA ATTACTGTGT GCTGGACAAG CTGCGGGACT 240 
TTGTGGCCTC GCCCCCGTGT TGGAAAGTGG CCCAGGTAGA TTCCTTGAAG GACAAAGCAC 300 
GGAAGCTGTA CACCATCATG AACTCGTTCT GCAGGAGAGA TTTGGTATTC CTGTTGGATG 360 
ACTGCAATC-C CTTGGAATAC CCAATCCCAG TGACTACGGT CCTGCCAGAT CGTCAGC3CT 420 
AAGGGAACTG AGACCAGAGA AAGAACCCAA GAGAACTAAA GTTAT3TCAG CTACCCAGAC 480 
TTAATGGGCC AGAGCCATGA CCCTCACAGG TCTTGTGTTA GTTGTATCTG AAACTGTTAT 540 
GTATCTCTCT ACCTTCTGGA AAACAGGGCT GGTATTCCTA CCCNGGAACC TCCTTTGAGC 600 
ATAGAGTTAG CAACCATGCT TCTCATTCCC TTGACTCATG TCTTGCCAGG ATGGTTAGAT 660 
ACACAGCATO TTGATTTGGT CACCTAAAAA GAAGAAAAGG ACTAACAAGC TTCACTTTTA 720 
TGAACAACTA TTTTGAGAAC ATGCACAATA GTATGTTTTT ATTACTGGTT TAATGGAGTA 780 
ATGGTACTTT TATTCTTTCT TGATAGAAAC CTGCTTACAT TTAACCAAGC TTCTATTATG 840 
CCTTTTTCTA ACACAGACTT TCTTCACTGT CTTTCATTTA AAAAGAAATT AATGCTCTTA 900 
AGATATATAT TTTAYGTAGT GCTGACAGGA CCCACTCTTT CATTGAAAGG TGATGAAAAT 960 
CAAATAAAGA ATCTCTTCAC ATGARAAAAA AAAAAA 996 

(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 785 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 
GGCACGAGGG GCTTTGCGTA CACAATAGCT GCTAGGAGTA CCCAAAGCCT GARTACARCC 60 

TGCTGGTGTC ATGGCCACGT GTGAGCAGGC CAGCGTCAMA CGGCTCGCTG TGACCCGTCC 120 

CGRAGACTGA AATGGGCCTG GGTCTTCTCC TKGTCCTGTG ATWAAAGTCC TCTCTTGAAA 180 

GTGGAGAGCA AAGGCACACA GAGGTGCGCG CTCACAAGAA TTCCTCCCGG TGACTGGGTA 240 

ATCAATGTTA CTGCTGTTTC CTTTGGAGGA AAGACCACAG CAAGATTCTT TCATTCGTCT 300 

CCTCCTAGCC TGGGGGACCA GGCTCGAACT GACCCTGGAC ATCAAAGGAG GGATTATGTG 360 

GCTGCTAAAG CCATCGGCCC ACAGCCCTGT TCACRTCTTG GTGCTTCTCT TTCCCAGAGG 420 

CTGGTCCCAG CCAGGCACAC ACAAAAGGCA GATTCTCGTA AACSCAGCCT CCCTCCCTGG 480 



AGGCTGCCTC CTGCCCTGGA TCTGGAGTGG AC<TIGCTCTG AGATTTTGAG TTCTTCTGCA 540 



GAGATGATTA AATATATCCA AGAGACATTG GAAAACCTGC TGAACATTTT ACATTGGTCT 600 

GCTCAGCACA TGGCTGGATG CGGATATTTC TATAATTCCA GAAAGTCACA CAGCTCCTCT 660 

GTATGAGACC AGTGGGCGCC ATTTAAAAGA ACAGGATGAG AATCTAAGA? ATATTATTAA 720 

TAAATGTAAT GGATTTTTTT TTTGTAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 780 

AAAAA 785 



(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1069 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 

TCCTCACCAT TCCCCTAGGN CAGGTCCCTG CAGGTCCCAC ACTTCTCCCA GGTCCCTAAA 60 

CTTGGGTCGG TCCTTTCCCT GGAGTAGCTG GNTCCTCCAG TCGAGGTCCC TGTTCAGTCG 120 

GTTCTTAGGC TCCTGCACAT GAAGGTGTGT GCCTGTGGTG TGTGGGCTGC TCTAGGAGCA 180 

GATACAGGCT GGTATAGAGG ATGCAGAAAG GTAGGGCAGT ATGTTTAAGT CCAGACTTGG 240 

CACATGGCTA GGGATACTGC TCACTAGCTG TGGAGGTCCT CAGGAGTGGA GAGAATGAGT 300 

AGGAGGGCAG AAGCTTCCAT TTTTGTCCTT CCTAAGACCC TGTTATTTGT GTTATTTCCT 360 

GCCTTTCCGA GTCCTGCAGT GGGCTGCCCT GTACCCTGAA CCTCATGAGC CTCTAAGGGA 420 

AAGGAGGAAC AATTAGGACG TGGCAATGAG ACCTGGCAGG GCAGARTACA AGCCCAGCAC 480 

CAGTGTCCCA GCCTTACTGG GTCCTTACCC TGGGCCAAAC AGGGAGGGCT GATACCTCCT 540 

TGCTCTTCCT AGATGCCCAC CTCCTACAAT CTCAGCCCAC AAGTCCTCTC CACCCTAGGG 600 

GGCTTGCTGC ATGGCAATAA CTCATAATCT CATTTGGAGG TTTGCCCTTT ACAGGGGCAG 660 

ATTTTCTGCT CAGTTCAACA ATGAAATGAA GAGGAACTCC CTCTTTCTAC AGCTCACTTC 720 

TATCAGAGGC CCAGGTGCCT CAGAGCCACA TTGAGTTGCT TTTTCTGGGA TGAGGAAGTA 780 

GGGTTAAACT CCCCAGTTTC CTGAGGGAGG CTCCTGACAG GTGCCCTTTG TCAGACCCTA 840 

CCACAGCCTG GATAGGCAGC CACATTGGTC CTCGCCCTTG CTCGGNACTC CGTGGTGGTC 900 

CTGCCCTTCT CCCTGCATGC CTGTGGGTCT GCTCTGGTGT GTGAAGGTCG GTGGGTTAAC 960 

TGTGTGCCTA CTGAACCTGG CAAATAAACA TCACCCTGCA AAGCCAAAAA AAAAAAAAAA 1020 

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAA 1069 



(2) INFORMATION FOR SEQ ID NO: 75: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 831 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 

GGACATTAGA TCACTGTGGA CCTAAAACAA ACAAACAACT ATAAG3AAAA TGGCATTAGA 60 

AATGGTCTGG GGATCAGTTT ATCACTGCAG TTGTTACATC ACCCCATGGT CTAAAATA2A 120 

GAGCTTTAGT CTGTCTCTGT TTCAGTTCAT TTTACAGGAG GTGAACATCA CACTTCCAGA 180 

AAACTCTGTC TGGTATGAAA GGTATAAATT TGATATTCCT GTCTTTCACT TGAATGGCCA 240 

GTTTCTGA1G ATGCATCGAG TAAACACCTC AAAACTTGAA AAACAGCTCC TGAAAC7TGA 300 

GCAGCAAAGT ACTGGARGCT GACTGATGCC CTCATGATTT TCCACCCTCT CTTCCCATAA 360 

AGCATCTTCC TAAGGAAATG AMCATGGCCT GATACTCATT TTGTCATTTG TACAGAGCCC 420 

TAAGGATGTT CTGAATTCAG TGGTGCCAAA TAAATGTTGA CATTCCCCTT TTGGTTGATG 480 

GAAGTATCAG TGTGGGAACT GTTTGCTTAA TGGCATTTTA TAAAATAAKA AKAKCATATT 540 

AGCAGGGAGG GAGATGATGG AGGGAGGGAG AAGTCCATTT GTCTTATTTA TCCTTTTTGT 600 

ATTAATAGAG AAGCACTTCA CAGTCACTGG CAATGCCATT TATAGGAAGA AGGTTCTGCA 660 

TTCCTGCTGC TCCCGGAGGG CTTAACTTTT TAATGAAAGA ATAAATGCTC TTCCACTCAG 720 

TAGATAAAGT GAAATGTGAA TTGTTAATAA CTGTGCACGG TCAATAAAGC GATGTTTTAA 780 

GGAATACAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAACTCG A 831 



(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 590 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 

TATATATAGA CNGTTAATAG TCGTGANTGN TGTGNACGAA CATTAACGGA AGTAGCATGT 60 

AGCCAGTCGA ATAACNTATA AGGACAAAGT GGAGTCCACG CGTGCGGCCG TCTAGACTAG 120 

TGGATCCCCC GGCTGCAGGA TTCGGCACGA GCTGCCAGGT GAGGAGCAGA GAGACTGTTC 180 

CCTTGGGTGG AGAGGTGTGG GCATGAGAGC CACCCATTGC CAAGCAGCAA GAATGTTCGT 240 

GCTTTTTTCC CTTCCAAAAT ATGCAGGGCT CAGGCTCCCA ATTCCGGGCC TGTCTGCTTT 300 

GCTTGTGTTT CTCCTGTCCC TGTTCTCCCG GAGGGCCCAG GTGGAACTCA CGACAGGGAG 360 

GGAGACGCTT CCCAAAAACC TGCAGGGCTA TTTCCCAGAA TTTGGTTTTC AAGTACAAAA 420 

CTTTTTGTCC TGTAAGATAT ATGCAGCCTC ACAGAAGCAG :CTCTC-CCT^ TACTTTACCA 480 

GCTACGTTTT TATCTTAAGC ACATGGG3CT CCCTTAGAAC TTACTCCACT 3ATTTAAAAA 540 
AAAAAAAAAA AAACTCGAGG GGGGGCCCGG TACCCATTCG CCCTAAAAGT 590 



(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1274 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: 

GAGCCACCAC ACCTGGCCTG GAAGGAACCT CTTAAAATCA GTTTACGTCT TGTATTTTGT 60 



TCTGTGATGG AGGACACTGG AGAGAGTTGC TATTCCAGTC AATCATGTCG AGTCACTGGA 120 



CTCTGAAAAT CCTATTGGTT CCTTTATTTT ATTTGAGTTT AGAGTTCCCT TCTGGGTTTG 180 

TATTATGTCT GGCAAATGAC CTGGGTTATC ACTTTTCCTC CAGGGTTAGA TCATAGATCT 240 



TGGAAACTCC TTAGAGAGCA TTTTGCTCCT ACCAAGGATC AGATACTGGA GCCCCACATA 300 



ATAGATTTCA TTTCACTCTA GCCTACATAG AGCTTTCTGT TGCTGTCTCT TGCCATGCAC 360 



TTGTGCGGTG ATTACACACT TGACAGTACC AGGAGACAAA TGACTTACAG ATCCCCCGAC 420 



ATGCCTCTTC CCCTTGGCAA GCTCAGTTGC CCTGATAGTA GCATGTTTCT GTTTCTGATG 480 

TACCTTTTTT CTCTTCTTCT TTGCATCAGC CAATTCCCAG AATTTCCCCA GGCAATTTGT 540 



AGAGGACCTT TTTGGGGTCC TATATGAGCC ATGTCCTCAA AGCTTTTAAA CCTCCTTGCT 600 



CTCCTACAAT ATTCAGTACA TGACCACTGT CATCCTAGAA GGCTTCTGAA AAGAGGGGCA 660 



AGAGCCACTC TGCGCCACAA AC<;TTGGGGT CCATCTTCTC TCCGAGGTTG TGAAAGTTTT 720 



CAAATTGTAC TAATAGGSTG GGGCCCTGAC TTGGCTGTGG GCTTTGGGAG GGGTAAGCTG 780 

CTTTCTAGAT CTCTCCCAGT GAGGCATGGA GGTGTTTCTG AATTTTGTCT ACCTCACAGG 840 



GATGTTGTGA GGCTTGAAAA GGTCAAAAAA TGATGGCCCC TTGAGCTCTT TGTAAGAAAG 900 
GTAGATGAAA TATCGGATGT AATCTGAAAA AAAGATAAAA TGTGACTTCC CCTGCTCTGT 960 



GCAGCAGTCG GGCTGGATGC TCTGTGGCCT TTCTTGGGTC CTCATGCCAC CCCACAGCTC 1020 



CCAGGAACCT TGAAGCCAAT CTGGGGGACT TTCAGATGTT TGACAAAGAG GTACCAGGCA 1080 

AACTTCCTGC TACACATGCC CTGAATGAAT TGCTAAATTT CAAAGGAAAT GGACCCTGCT 1140 



TTTAAGGATG TACAAAAGTA TGTCTGCATC GATGTCTGTA CTGTAAATTT CTAATTTATC 1200 
ACTGTACAAA GAAAACCCCT TGCTATTTAA TTTTGTATTA AAGGAAAATA AAGTTTTGTT 1260 
TGTTAAAAAA AAAA 1274 

(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1133 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 

AGGATTTTTC CTTGTTCAAC CAAAATCTGA GCATTCTTTC TATGTTGAAA ACACTGAAAA 60 

ACTAATTTWA GTTAATGAAC TAGAAAGAAT ATTGATTTTW AAGAAACAGA AAAATACTAC 120 

TTATTTTCCT TCTCAAATAA CGTTTCTTTC AAAAACTTCT GGCTGAAGTA TAACATGCTG 180 

GTAGTTAACA TAAATCTTGT CTTTCTCTTG TTCTTTATCT TTCTTTGTTA TTTAGATGCT 240 

TGTATAAATG TCTTTTGTTT TTATTAAGTG CCTAATTGAC AGAGCTTAAT TTGAAGAAGT 300 

GCCCTAATTT ATTGACCACT TAAGAATTGC CTTTATTGGG GTATTTTATT TGTTCCTGCG 360 

TCTTTTTGAT GTTGTTCAGT CTACTCATCC CTGTGAGTAT GTGTGGGGGA CAGCTGATAG 420 

AAGGGAGGAG AGTGTGTCTA TGCTCAGGAT TGCCCTTTAG CCACTCAGCC AGAGATCCAC 480 

AGGGAGCAAC AAGGACAGTT TCACATGCTT AGACTTTCTT GGAAGAAACA GTGAGGAGGA 540 

GTAAGTCGTG AGTAGTGTCA AGCTGGATGT AGAATTGTCC TAAGGCAGTT GACCCCACCT 600 

TCCAACATGT TTTCACTTTA TTTGCCCCTC CCTACATTTG GGTTAGGTTC CATTTGGATT 660 

TGCAGCAATA ATGACTTTAT TTCTCTCTTG GTCAGGATTT GGCACATAAA ATCCTTTTAT 720 

TATAGAACTA GCTATTTTAG TTACATAGTA ATGTAACTAA TGGAGAGATT TATAGAGAAT 780 

TTTGKTTTTG CTGTCATATA TGTCCATTTT GGAGACAGAT ATGATAGAAC TAGAAATTAA 840 

GTTGCATTTC TGCAAGTGCC ATTTGAATGA ACTTCAAGTA TCTTCTTAAT TATTAAATTT 900 

TCTGATGAAG GCATTGTAAC AAATATATAG TATTATTAAA TCTAATTAAT ATTTGGAAAT 960 

ATTAATAAAT AGGTATTTTA TTTACTGTAA AAAGTCAAAC TTCATTATGT AGATAAATCT 1020 

TATTCTTTTC ATTCTTTCCC CTGTTTACAT CCTTTTTACA AAGCTTAGTC ACCAATTAAA 1080 

GCTTTCCTAT CAAAAAAAAA AAAAAAAAAA ACTCGAGACT AGTTCTCTCT CCT 1133 

(2) INFORMATION FOR SEQ ID NO: 79: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 661 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 
GAATTCGGCA CGAGGGGAAA AGGATGCTGA ACGAGAGCAG AAAGCCTCTT TCCTTTGCTT 60 
CACGCCTTTC CAGTCTTTAT TTTAAACTCG GGTTCCCTTT CTGTGGTCGC AGCAACCTTT 120 
ACTCCACCTG CACTGCTGCT CCTGGGGGCT CCCCAGGCCT CCCTCTGCCT TTCTACCCAG 180 
TGGCTGACGG GATGCCTGTC TTGCCTGGAC GCACCACTGC TCTCCTGTCC CTCACCTTGG 240 
CTTTTGCTGT GCCCTGCTCT GGGGTTGAAG CTGGCCCATG TGTCCCCCGG AGTCATGGCT 300 
GCTCCTCCTG GGAGGCCTCT GTGTGCGTCA CGTCTTCCAC ACCTGGGGGC AGCTGGCGAG 360 
CCCGTGCTCT GTTCCCCTCG GCTGCTTGGC ACAGAGYTGC AGCCTGGGAY TCTCCGTGGA 420 
CCCAGACTGG GGATTTTGCC AGGGGGGCGA TGGGAGGAGC AG3TGCTTTG CCTGGCGGCT 480 
GTGTCTGCAT TTCTGGACGC CCCAGAGCAC AGAAGTTGCC GGCACTTTGA GGTCTTCCTC 540 
GGCATGTGCC AGATTACATG AGTGACGGCT GGGAATATGT TTTCTTTTTT GTAATGGAGG 600 
CGTGTTTCAC ATATAGTAAA GCTCACCAAA AAGTAAAAAA AAAAAAAAAA AAAAAACTCG 660 
A 661 



(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1378 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 

ATTGGGTACC GGGCCCCCCC TCGAAGTTTT TTTTTTTTTT TTTTAATGAA AGCTCTCAAA 60 

TAAGCGATTT TATTCCTATC CATGATTGCA GACATTTACA AAACCATAAC ATCTGAGTTC 120 

ACCTTAAAAA ATAACTTATA TAAAGCAGTG ATATACACAG CACAAAATAG TTCAGGGAGG 180 

GGGCAGGAGC AACTTGTAAT AATTAAAATG TAAACGTGAA AAAAAGGATG GAATAAAAGT 240 

CCCTACTTAT TTCTACTTAA GATGTCATGT GATAATATTT TACAATGTCC TGTGGGTCAA 300 

TGTATGTATG TGTATATGTC TGTATAACAT ACACATATAC AGTACATTCT CTTTCCCACA 360 

CATATACATA CACACATAAT TATTTGCAGT TCAGTTTAGG GCAATTCTAA TATGCCACTC 420 

CGTACAGTTG TTTGAATCAC ATTTGGACCC GCTTTCTTCA CAAAAGAGGG GAGAGAGCAG 480 

GAAATAAAAA GGTTGGTTTG GTGTGACTGA GATTCCTTTG TTTAACTGTA CACTGTGATG 540 

AATAATTTTC TTCCGTAG7A GTT^rTGTGAA GGGCTGACTC ACTGTGGTTT TCATGAGGAG 600 

ACTTGGTAAT GGATCACACG CTCATTGTCA TGCTAGGGGA GTAACTCTCA CTCTGAAAAG 660 

GATTTAAGAA ATTTCCCCCC ATTTCGCCAT CATCCCTTGG AGTGCCCGGT TGATTACTCA 720 

GGCTCATATT ATTG3GAGAA TTCTTGGAAA TACTGTCCAT ATCTCCTGAG CCTAAAGAGC 780 

CATTCATGTG ATGTGACTC2 ATTCCTCCTA ATCCACCCAT G3GACCATCT GACCCAGGRC 840 

CCATTGGAAA ATTAGGTCTG TTAGGTCCAG GAGGTACTGC ATTCATTAAA GTATACATGT 900 

TATCACCAGA GTTGGTTGAA TCTGCTGGAC TAGGCATGAT GGGTGTT3CT GGTGGCCCTC 960 

CACCTCCTGG AGGACCTACA TAATTCCCAG GAGATGCTGA GGAGTATGGT ATTGAATTGG 1020 

CATTTGTTGG GTTTGGCCAA GGTCTACCAC CACCTGGACC CATGTTCATT CCAGGCATTC 1080 

CAGGGCCAGC TAAAGCATTC AGTGGGGGTG TCATTGCACC TCCATAGTTC TGTGGTCCTA 1140 

AGGGCACCAT TCGTCTTGGA GGAGTCATTC TCTGCATTGG CCCACCCATA TTTGGATGTC 1200 

CTTGTTGTCG AGTTGGATCC ATTCCACTGG GGAGTAATGG CTGACTTCCT GGGACACCTG 1260 

CAAGTGCCTG ATTAGGTATC CTCAATGGGG GCCTTGGACC TGCAGGGTAC CGAGGTGACA 1320 

TAAAAGGGTA ATCATGGAAG GCTTTTGCTT CACTTGAGTG TTCAGATGTT TCACGTCT 1378 



(2) INFORMATION FOR SEQ ID NO: 81: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1440 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 

ACTTTGTCCA AATGTGTCTG TCACATGTAG TCAGCTGNAG NAATTTAAAA TGAATTGCCA 60 

AGTGAAGAGT CTGTGGATTA ATTGGCCGTT AATTAACAGG CTTTATCAAT GTGTCCTCAA 120 

GGGAGAGGCC CAACCCTAAT TAAGGAGCTA AACTTCCTGA GTGAGGGGCT GTGAGGATGG 180 

AGGTGGAGGA GGCATCTGGG GCGGGTGGTG GCCGGGCCAG CAGATGGCGC CTCCCTGGCT 240 

GAGCTGCCCG CACCGCCAGT TCCCTCATTT CCACTCAGGA AGGCAGAGAA GGCAGAGTGA 300 

TCTCCTCAAG GAAGAGCTTC CCCAGCCTTC GGGAGCAGCT GGCAGGGCGT CCGGGAATAA 360 

GCCCTACACG CCGCCGCCTG CCTCCAACTC ACTAACCCTG CGCCTCTTGT CTTTCAGATT 420 

CAACGCGTTC AACAGAAGCC ATCCCCAGCC CAGCTTAAAT TATAAAGATA GACAATAACT 480 

CTGTTCCAAT CTGCGTGGTG CTTCTTTAGT AAATACTGTA CAGATTTTAC CATGGAGAAC 540 

TTTTTTTTTA GTTTTTACCT TTTCTTAATT ACCCTTATTC CGAATGGACG AACACTTTCT 600 

ACCACTGCTG ACCATTGTAA AATACCGTGT ATATAAATCC CATTGAAATA ATGCCCTGGA 660 

ATAGAACATC T : IAAATGCTG CTTAATTACA GACTCAGGTC :GATTACTTGT ATTTCATGTA 720 

ATGTTCCTCC AAGTTAGACA TCTGGTGCAA GACCAACCGG iSAGACCATGG AATTGTCAAA 780 

AGTACAAACT GACAGTGTGT ATATTTAATT TAAAGACTTA TTTAAAAACT CACAAGCTCT 840 

CACCTAGACT TTGGAGAGCA GTCTGTTTTC TGTAATGTCT GATACTAGAA ACTAATTTGC 900 

TTATTTTAGT TGTATTCAAG ATTTGAAGAT GTATTTTATA GACAAGTTCT GTTTTTGAAC 960 

TTTGTGGAAC TGTTCCAATC AATCAATTTC GCAGTTATGA TGAGTATTTA CATTATGAAT 1020 

GTATAACCCA GACATGATTT GTAAAGCCGA CAGTATGTTT CTATTACACA ACACTTTTTG 1080 

ATACAGCGTC TCTTGTCTTC ACTGATACTG GAGTCTCCGT TGTCTGCNNG GTGCCTTCGA 1140 

GTTTCTAGTT ACAGACAGAA TCATACTGTG ATTTTATTTT TAATATGGAT ATGCTATCAA 1200 

ACTGTGATAC ACTTATAATT CACTGGTCCT GCATCAGGAG ATGGAGTGGG GAAAACTGTA 1260 

TTTAATACAG TTTGTATCTG AATAATCTGT ATGGTTTATA CAGTTTGTGT TGTTCAGAGA 1320 

TGTTTAAAGT TTGATCTTTG TTTTTCTAAA GATTAAAAAA GCACTTGCCC CACTGTAAAT 1380 

ATACAGCATG TAAAATTTCT RTAGTATATA AATGGCAGCA AATCACAAAA AAAAAAAAAN 1440 



(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1381 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82: 

CCCGGGCTGC AGGAATTCGK YACGAGGCCA GCAGTTGCTC CCAGTTCAGG AGGTGCTCCT 60 

GTACCCTGGC CACAGCCCAA TCCTGCCACT GCTGACATCT GGGGAGACTT TACCAAATCT 120 

ACAGGATCAA CTTCCAGCCA GACCCAGCCA GGCACAGGCT GGGTCCAGTT CTGACCTGAG 180 

CACGGTTTTT CCTCATGTGA CTTCTGGGAA GGCGCTCCCT CATCTGGGCC AAAGGAAGGA 240 

GGACGAAGCC CTCCTCAGCT GGCCTGTGTT TGGGGCATGA ATCTCTCCTC TCCTCCTTGT 300 

CTGGCTCTGT TGACAAACCG GGCATGTTTG GCAGTAAATT GGCACCGTGT CACACTGTTT 360 

CCTGGGATTC AAGTATGCAA CCAGAACACA GGAGAAGAAA AGCTCCAGGA TCCCTGTCCC 420 

CATCTGTCCT CTTGATGTGA GAGAGACTCT GAGACTTCTT CCATCGCAAT GACCTGTATT 480 

AAACACAAGC CCCCCAAGCA AAAGAAGAGG TTGASTTTSC TGCCAGGATT CAGATCAGCC 540 

CTTCCCAGGG TCTGCAGGTG TCACATGATC ACAGTTCAGC GGGACrXTTT CCGTACCCAC 600 

ACTGGCTGTA GCACTTCAGT CCATCTGCCC TCCAGAGGAG GGTTTGTTCC TGATTTTTAG 660 

CAGGTTTAGA GGCTGCAGCT TGAGCTACAA TCAGGAGGGA AATTGGAAGG ATTAGCAGCT 720 

TTTAAAAATG TTTAAATATT TTGCTTTGCT AATGTGCTGA TCCGCACTAA CTCATCTTTG 780 

CAAAAGGAAC TGCTCCCTCG GCGTGCCC2A GCTGGGGCCT CTGAA3GGAT TCCTCACTGT 840 

GGGCAGCT3C CCTGAGCTTC AGGCAGCA3T GTTCATCTCT GGCCA3TTGT CTGGTTTCCA 900 

TGTATTCTAG GCCAGGTA3G CAACACAGAG CCAAGGC3GG TGCTO3AAG0 CAGACGGAAC 960 

AGTGTTGGGG CAGGAAGGTG GATGCTGTVG TCATGGA3CT GTGGGAGTT3 GCACTCTGTC 1020 

TGCTGGTGGC CCTCTCGGCT CACATGTTCA CAGTGCAGCT CCTGGCAGAC TTGGGTTTTC 1080 

TCTTTGGTGG TTTCTAAAGT GCCTTATCTG CAAACAACTT CTTTTCTCCT TCAGGAACTG 1140 

TGAATGGCTA GAAGAAGGAG CTCAGTAAAC TAGAAGTCCA GGGTTGCTT3 GTTTACTGGT 1200 

TTATAAGAAA TCTGAAAGCA CCTCTGACAT TCCTTTTATT AACTCACCTC TCAGTTGAAA 1260 

GATTTCTTCT TTGAAAGGTC AAGACCGTGA ACTGAAAAAA GTGTTGGCCT TTTTGCGGGA 1320 

CCAGATTTTT AAGATAAAAT AAATATTTTT ACTTCTGTCA AAAAAAAAAA AAAAAAATNT 1380 

1381 



(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1706 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83: 

ACTGCACCAC TGCCCAGGTC TCCCGGCTGG ATGAAGACGT GGTCCATGAG GAAGCTGGCT 60 

AGCTCAGACT GGAGAGTAGC TTCAGGAAAA AAGACAAGTG GCCTAAGGAA ATCACGGCCC 120 

CCAACTATCA TCTGAGGGCT AAAGATGAGA AGTAGATCAC TTAATAAGAC AAAAGCCTGT 180 

AGGGGGAAAA GAAAGGATGT TTAAAAGGAC AGAATGTTTC CCAAGGTAGA AATGACACTG 240 

TCAATTTCTC CTTGGAATGG GGGCAGGGAT ACTCGCCTTG TTGCTCCCAC TTGAGTCAGT 300 

ACTCACCTGC TCCTGGATCT CAGTATCCAC ATCTGAGAGG CAACTCTGGC AGAGTTCACA 360 

GAAGGCCACC ATTCTGTCCC TCAAACTCGA CAGCTGCTTC TGTGGGCACA GTGGCTTGAA 420 

GGGGAAGAAT GAAGACACAG ACTCCTCTGT TCCCATTATC CCATCTAAGA CCCACACTCA 480 

CCTGGGGAAG CATCTGATTT AGAAATGTGG GTTAGTGTCC AGAGAATGOA AAAATAGACA 540 

AGAGTCAAGG CTGGCAGGAT AACCTGTAAC AACAAA3GGT TTGAAAAATG AGGTTTQGGT 600 

TAGGAGAGGG AGAGACAGAT AGCCAGAAAC ACACCAGTGA AGAGGAGAGA AAATGAGTAA 660 

AGGGAGAGCT AATTCCTTTT CCAGTGGAAA ATGAGTGATA TTCTGGACAT ICTTCAGAGG 720 

CATCTACACG AAGTAGAAAT GTCACCGCTT CCTAATTTAC TCTACGTCTT CTAGAATCCC 780 

TCAATATTAT CCTTGGCTTC CAGGAAATCT AAGAAGACCC TGGAAGTAGA GTCCACCTTC 840 

TAAGAGAGGA ATGTAAGAGG TGACCCCCAC CCACCTGATC TTCCTCGCTT TGTCCACTCC 900 

ACGCACTGAG ACTTGACACA CCTAGTGGCC ACCTAGAACG TAGGTCCTTA AAATYTAGCC 960 

CCCCAGCCCC CAACCCATCT CTAGCCTGTC CACTCACCTG GTGAGGAACY TYTCCTGTGT 1020 

CCACAGCYTT CTGCAGGAGT TOGCA^CATG GCTCATAGAG CTCCCAGCGA GTCAGGTCAT 1080 

GAGTGCTTTG GGGGAGAAAG GGGAATGTTA TACTGGAAAA GAACAGAGGG AACCAACTCC 1140 

ACAGACACCA GTAAAAACGG GATGGGGAAG AGGAGGAAAG CCACTCACTT GTAGAAGGCA 1200 

GAGAGGCGTT TCAGAGTGGC TGCCAGATTA TATACCTCAT CCTCATCTAG GAAGGACGAC 1260 

TGAGAAGGAA AGAAGATCCA CAATAGCATT TCCCCCAGAA CTCATCAGTC <GACATCCCCC 1320 

GTCTTGCAGC CCCTCCCACC CTTGTTTGGG GTGTCCCATT GTCCAGCCCC AGCTCCTACC 1380 

TGTAACAGCT CTTCAAGCTC CTGCTGGAAR CGGTCAGTCA GCAAATCTAC TAGCTGGCTG 1440 

CGGGCAAAGT CCGCCCGGCT GAAGAAAGTG AATTCGGGAT TACAGAGCAG GTAAGAGCAT 1500 

GCGCCCCAGC CTCAAGCACC GCTGGCTCTG CATGCTTCAC CACCACCTCC TGGAGTTGCT 1560 

GCAGGAACAG CTCCAGGTGC TGAGAAGAAA AGGCAGAAGA TGGTGTGCTG TGGGGATGGG 1620 

AGGAGGACAC TCTTCTGGCG GGAAGTGGAA CGGGGTTAAA AGCATTAAAC TTCAAGGATA 1680 

AGATGCCTAA RAAAAAAAAA AAAAAA 1706 



(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: 

GAATTCGGCA CGAGCTTGGT AGCCTTAGAA CTGCATGAGC TGCTTTACCA CTGGGAAACA 60 
CGAGCACAGC CTAGCTTGAT TTTGTATGTG GTATCAGATC TAAGGTGGAT GGAATTCAGG 120 

ACTTCCTGTC TACTCTTTGA TTTTGTTTTA ITiTi'AGAAA TGTTTTATTT TC-TTTTATTC 180 
A7TTATTCAT CTTCAGAGAC ATGGTCTGGC TCTGTTGCCC AGGATGGAGT GCATGGTGTG 240 
ATCATAGGC: ACTGCAGTGT TGAGCTCCC3 GGC7CAGGCG ATCCTCCTGC CTCAGCTYCC 300 
TTAGTAGCTG GGACTATA3G CACATGCCCT ACCATGCCTG GCTTTGTCTA CT " ITTT GAAT 360 
GATGTCYCAA ACTAGAAGGT CTATTAATTT AAAAAATTAA GGATAGCATG CCATAATTAA 420 
AAATAATAAC AGTGGGAAAA GGGACCTTCC AATGATTCAG ACATCAACTT GTGATTTAAA 480 
AAAACGAAAA ATAAATAATA GGAAAAAAAG GGGAAAAAGT TAAATAAAAA TAAAATTAAA 540 
AAAAAAAAAA AAAAACTCGA GGGGGGCCCG GTA 573 


(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 684 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85: 


CTCTTTGGCT GTGTCTACCT CCTTCATCTG CTGCGCCGAC ATAAGCACCG CCCTGCCCCT 60 

AGGCTCCAGC CGTCCCGCAC CAGCCCCCAG GCACCGAGAG CACGAGCATG GGCACCAAGC 120 

CAGGCCTCCC AGGCTGCTCT YCACGTCCCT TATGCCACTA TCAACACCAG CTGCYGCCCA 180 

GCTACTTTGG ACACAGCTCA CCCCCATGGG GGGCCGTCCT GGTGGGCGTC ACTCCCCACC 240 

CACGCTGCAC ACCGGCCCCA GGGCCCTGCC GCCTGGGCCT CCACACCCAT CCCTGCACGT 300 

GGCAGCTTTG TCTCTGTTGA GAATGGACTC TACGCTCAGG CAGGGGAGAR GCCTCCTCAC 360 

ACTGGTCCCG GCCTCACTCT TTTCCCTGAC CCTCGGGGGC CCAGGGCCAT GGAAGGACCC 420 

TTAGGAGTTC GATGAGAGAG ACCATGAGGC CACTGGGCTT TCCCCCTCCC AGGCCTCCTG 480 

GGTGTCATCC CCTTACTTTA ATTCTTGGGC CTCCAATAAG TGTCCCATAG GTGTCTGGCC 540 

AGGCCCACCT GCTGCGGATG TGGTCTGTGT GCGTGTGTGG GCACAGGTGT GAGTGTGTGA 600 

GTGACAGTTA CCCCATTTCA GTCATTTCCT GCTGCAACTA AGTCAGCAAC ACAGTTTCTC 660 

TGAAAAAAAA AAAAAAAAAA AAAC 684 


(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1036 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86: 

TGGAGGCAGA TGCACAGGAG AAAGGTTCCC GTCCGCACCC TCTCAGACCT GAGGCTOAGC 60 

TTGCAGTGAG GGCTTCTCCT CGGCCCCTCG CCCGCCCCCA GAGCTGCCAT CCCTGCTGTT 120 

ACAAGCCAGA GGAGCCCGGA TGTGAGGCCC CAGATCACCT CCAGGGACTT GGGGTTCCCA 180 

TCTGAAATCC TTTATTTTTG TACCATGGGG TGGGCCCCGG GCTGAGAAGG AAGAAGCACC 240 

CTCTCCCCGG CCTCCTCTGT CTGCACCCGT GGGGCTGTGA CTTACTCCTG CCTCCAGGGG 300 

CGGGGCGGGG CCCCCTGGGA CCTCTTAA.GG CCCAAGGTGG GCCCCAGGAC CTYT3GGCAG 360 

AGTGGAYTGC TCATGGCAGA TGTGTGGCAA TGTCTGGCTG WGTCTTTCCG GCAMCTGCGT 420 

YCCCTYTCCC GGGYTCCCCT G~TGCATGGT GGATGTGCTC CTTCCTGGCC CGGTCACATT 480 

GCCTCCTTGA GCCTTAGTCC AGGGGGTCAC TYCTCCCACC CCACCTACCT CACAGGGTTG 540 

TTGTGAG3GT GCACAGAGGA GCAAAGTCCC TGAAGGCCCT CAGGCAGTAT ATAGGGGCCG 600 

CCCACCTTCA GCTGCCCTGG GATGGGAAGG ACCCAGCCCG ACCCCTGGGC ATAACACTGT 660 

GTTTG^AAAT GGAGATTCAG GTATTGGGGA TGCAGGTTGT GGGGAGCTOG CCTGGCAGAG 720 

TAGGGGTAGT TGGCTTGGCC TTCTCTTTGG TGATCCCACC CCCAGCCATT TGCATTGCTG 780 

GCCCAGCGCC TGGCCTGGGG GGCGGGGAGA GGCAGCAGAA GGGGCTGGGC AGGGGCGGTG 840 

GAGGACTCAG GAACTGCCCG GGGAGAGTGG GTATGGCGGC TGAGCCAGGG GCCCTCCTGT 900 

GTTTGACTTC CCGGGATGGG TCCTTGCTTC TCAGCTGTGT CCGACCCCAC CATGTAATAA 960 

AACCCAAAGG AACAGCAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAN 1020 

CCCNGGGGGG GNCCCG 1036 



(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 908 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87: 

TTAAACAAAT GGAATCATGC AATATGTGAC CTTTTGCGTC TGGCTTATTT TATTTAGCAT 60 
AATGTTTTTG AGGTTCATCC AAGCTGTAGC ATGTATCAGC ACCTCATTTC TTTTTCTGGC 120 
TGAATATTAT TCCATTATAT GGATTTACCA CAATTCATTT ACCTATTCAT CTTTTGTTTC 180 
TGCTGTCTGG CTATTG7GAA TAATGCTTCG ATAAACATTC ATATACAAGT TTCTATGTGG 240 

C7TTATGTTT TCATTTCTCT TGGCTATCTA CATGGGAGTA GAATTCTAGG TCATAATATA 300 

ATTTTATGTT TAACTTCTCA AAGAATTGCC AAAAGGTTTT TC ATAGTG* 3 '2 TGCATCATTT 360 

ACATTCCCAC CGGCAATGTA CAAGGATTTC TATTTTTCCA TATCCTTOZA CTTACCAACA 420 

CTTCTTTTTK GT.VATWATTT TGTTTTTTCA TTATTGCCAC CCTAGTGGVT GTGAAATGGC 480 

ATCTTATTGT TTTGATTTGC ATTTCTCTAA TGACAAATGA TATCATACTT TTTTTATGTG 540 

CTTACGGATC AAA;3GTATTT CCTTGGAGAA ATGTCCCTTC AAGTCCTTTG CCATTTCAAA 600 

ATTTGGTTAT TTGTCTTTTA TTATTCAGTT TTAAGAAATT CTGGCCAGGC GCAGTGGCTC 660 

ACCTGTAATC MTAGCACTTT 'GGGAGGCCAA GGCGGGCAGA TCACTTGA3K TCAGGACTTC 720 

GAGACCAGCC TGGCCAACAT GGTGAAACCC CATCTTACTA AAAATACAAA AATTAGCTGG 780 

GCGTGGTGGC AGGTGCATGT AATCNTATCT ACTCAGGAGG CTGAGGCA3G AGAATCGCTT 840 

GAACCCAGGA GGCGGAGGCT GCAGTGAGCC AAGATCACGC CATTGCACTC TAGCCTGGGT 900 

GACACAGA 908 



(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 655 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88: 

TGCACTGGTT CCTTCTCCCC AGCAAATACT C^CTTCTTGT TTTTCTCTGA TGTGGCAGGT 60 

GACTACAAAA TCCGCCTTGG TATTCTTCAA ATGCATATAT ATTCCTTTCT TGTCAGCTCC 120 

CTCTCTTCCT AGATTAGAAA ACTGCCTCAT TTTCTGCTCA CTGGATGTGC AGTCCCAGCT 180 

TGTCTTCCTC TCCTCCCCCC CTGTTGCAGG TGTTCTTTTT TTTTTTCTTC TCTCCCCACT 240 

GGGCAGCAAA AGTTGTTCCA CAGTGGAAAW TTAGGCATCC TCAAGTTTCY TCCCAGCTTC 300 

TGCTGTGTTT TCTTAGAGTA AATTGCCAAT TTCTGTTTTT ACAGGAAATC CTTTTTTAAA 360 

AATGGAATCA GTGTGGTCCC CATCTACTCT GCAAAAATTG CATTTTTCTC TATTTTCAAA 420 

TGAGATTTGT TCAAGTTTCA AAACCACGTG AAATAATAAA TGTATAGTAG TTTTCTTTTC 480 

CTTGGGCATT GCTWGATATG TGAAATGGGT TTATGAAAAA TAATAAAATC ATAACGCTAT 540 

TTGTTTGACT TTCAATTTCA TGGGAATTTT TCTCAGCTAA ACTCTAAATG GTGATTAPGC 600 

AAAAAAAAAA AAAAAAAACY GRAGGGGGGC CCGGTACCAA TTCGCCCTAT AATGA 655 

(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1102 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89: 

TTTTTTTTTT ACCATTTAAA ATAAAATGAA AGTGACCTTC TGTTTATAAA AATCTTTGTC 60 

TGCATCTCTG CTTATTTCCT TAGAAGAGAT TCCAAGAAGC GGTGAGTGAT TTCACGGCAG 120 

CAGAGGGTTG GGACATATTA CGGGCGCGGA TCCCTCTTGG AGTGAGATGA CTCTCCGGAG 180 

AGATTTAGTC GTCACCCTCG CGTGTGAGGC TGCGTCACAC CCCAGGGATG TGTCTATCAA 240 

GATGGAAGAT CTTTTACACG CTCTTGATTT TGTTTGSCTY TTTTTCTATT ACTAGTGAGA 300 

AKGAAACTTT TTATATGATT ATTATCCATC ATAATCCAAC ACAAATTACT GCTTCATGTT 360 

CTTTTACTTT CCTGTGAAGG TTTTAGTGCC TTTTAAAAAT TGCTATATAT TAAGCTTGTT 420 

AATACTTCCA TGCTGTATTT GTGGSCATCA RTTTCCCCGG GNACAGGCNT GCACATTTTG 480 

CCTTCACACG CTGGGTGGTT TTTCATTTTC AMTTCTATTT CTCGTTCTTC TATCGTTTTA 540 

TGTTCAGACG GGTTTCTCCG TGTAGAAAGC AGTTTATGAA GATTTACTTT CGACAGTCTT 600 

CTCTCTACTT TCTACAGTGA ATTCTCTGAT GTGTCTGGGA GTTTGGGGGT CTGGGTAAGA 660 

RTCCTCCTCT CACCCTATTC TCTATTACGA TCCACAGCCT CATGCTTTAT GARATTGGTG 720 

GCCGGGARCG GGGGAGATTT GCGGATCCCC CAAGCCAGAC TTTATCCCCC TATCCCTGCC 780 

TCTGGATCCC ACGTACAGGC CTGGGAACTC CCTGTGGGTA GGGGCCAATG GTCTCGCACT 840 

CTCACCTGTA CCCCAGGGCT GGCACAGGAT GGTCAAGGAG AGAGGCTGCC CAAGCGCATC 900 

CYTCTGGTGT CCCCCTGACA CGCCTCCAAA GTGAGCAGGT AGGTTTCAAC AGCCCCACGT 960 

TGCAGGTGGG AGATGAAGCT CAGGGTGGAG ACCAGTATCT CACAGTTCTC TTTGCATGGC 1020 

CGGGTACTTG TTAGTCAACT GATCAAGTGA AAATTCTAGC CCCAGAGGCA GGAGAATCCG 1080 

GAACAAAATT AAACCAGCCA GG 1102 



(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1533 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90: 

GGCAC3AGCC GNCAC3GGCA GCGTZCCATA GCGCCAGGGA 3CCCCTGGCA GCGGGAGCC3 60 
CGGGTCGAGG TTATGGATGC AGC33GCGGC CCC:GGGGCG rGCTCCCGCG GCCCTGCCGG 120 

TGNCTGGTGC TGrTGAACCC GCG-Z G-3C 3GC AAG3GCAAGG 2CTTGCAGCT CTTCCGGAGT 180 

CACGTGCAGC CC3TTTTGGC TGA333TGAA ATCTCCTTCA CGCTGATGCT CACTGAGCGG 240 

CGGAACCACG CGCGGGARCT GGT303GTCG GAGGAGCT'GG 3CCGCTGGRA C3CTCTGGTG 300 

GTCATGTYTG GAGACGGGCT GAT3CACGAG GTG3TGAACG 3GCTTCATGG AGCGGCCTGA 360 

3TGGGAGACC GCCATCCAGA AGC : ICTGTG TAGCCTCCCA 3CAGGCTCTG G3AACGCSCT 420 

GGCAGCTTCC TTRAACCATT ATCCTGGCTA TRAGCAGGTC ACCAATGAAG ACCTCCTGA3 480 

CAACTGCACG CTATTGCTGT GCC3CCGGCT GCTGTCACCC ATGAACCTGC TGTCTCTGCA 540 

CACGGCTTCG GG3CTGCGCC TCTT3TCTGT GCT'ZAGCCTG GCCTGGGGCT TCATTGCTGA 600 

TGTGGACCTA GAGAGTGAGA AGTATCGGCG TCTGGGGGAG ATGCGCTTCA CTCTGGGCA2 660 

CTTCCTGCGT CTGGCAGCCC TGCGCACCTA CCGCGGCCGA CTGGCCTACC TCCCTGTAGG 720 

AAGAGTGGGT TCCAAGACAC CTGCCTCCCC CGTTGTGGTC CAGCAGGGCC C3GTAGATGO 780 

ACACCTTGTG CCACTGGA<3G AGCCAGTGCC CTCTCACTGG ACAGTGGTGC CCGACGAGGA 840 

CT/ITGTGCTA GTCCTGGCAC TGCT3CACTC GCACCTGGGC AGTGAGATGT TTGCTGCACC 900 

CATGGGCCGC TGTGCAGCTG GCGTCATGCA TCTGTTCTAC GTGCGGGCG3 GAGTGTCTCG 960 

TGCCATGCTG CTGCGCCTCT TCCTGGCCAT GGAGAAGGGC AGGCATATGG AGTATGAATG 1020 

CCCCTACTTG GTATATGTGC CCGTGGTCGC CTTCCGCTTG GAGCCCAAGG ATGGGAAAGG 1080 

TCTGTTTGCA GTGGATGGGG AATTGATGGT TAGCGAGGCC GTGCAGGGCC AGGTGCACCC 1140 

AAACTACTTC TGGATGGTCA GCGGTTGCGT GGAGCCCCCG CCCAGCTGGA AGCCCCAGCA 1200 

GATGCCACCG CCAGAAGAGC CCTTATGACC CCTGGGCCGC GCTGTGCCTT AGTGTCTACT 1260 

TGCAGGACCC TTCCTCCTTC CCTAGGGCTG CAGGGCCTGT CCACAGCTCC TGTGGGGGTG 1320 

GAGGAGACTC CTCTGGAGAA GGGTGAGAAG GTGGAGGCTA TGCTTTGGGG GGACAGGCCA 1380 

GAATGAAGTC CTGGGTCAGG AGCCCAGCTG GCTGGGCCCA GCTGCCTATG TAAGGCCTTC 1440 

TAGTTTGTTC TGAGACCCCC ACCCCACGAA CCAAATCCAA ATAAAGTGAC ATTCCCAAAA 1500 

AAAAAAAAAA AAAAAAAAAA ANCCCGNGGG GGG 1533 



(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 575 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: 

ATCCTCTGGA ATCTAGGTGG AAGCCACCAA GCCTTCTTCA CACTTGCGTT CTC-AGCATCT 60 

GCAGACTTAA CCCCATGTGG CAATCACCAA GGCTTATGGC TTGTGTCCTC CAGAACTGTG 120

GCCAGAGCTG TACCTGGGCC CCTTTGAGCT GAGGCTGAAG CCAGAGTCTG AAGCTCAGCA 180

GGGCAGTARG GCCCTGGGCC TGGCCCCTGA AACCATTCTT TTCTCCTAAG CCTCT3GGCC 240

TTTGATGGGA RGGGCTGTCC TCAAGATTTT TGAAATGCCT TTGGAGGGTT TTTGCCTTGT 300

CTTGGATATT GGCTTCCTTT TAGTTATGCT CATCTCTCTA GCAAGTGAAT GTTTCACAAC 360

CTGCTTGGAT TCTTTCTCTA CCACAGARCC AGGCTGCAAA TTTTACAAAC TTTTACACTC 420

TGTTTCCCTT TTAAATATAA ATTTCAATGT TAAGTCACTT CTTTGCTCCC ATATCTGATT 480 

TAGGTTGCTG GAAGTAGCCA AGTCACCTCT TGAATGCTTT GCTGCTTAGA AATTTCCTCT 540 

ACTAGGTAGC CTGGGTCATC ACACTTAAGT TCAAA 575

(2) INFORMATION FOR SEQ ID NO: 92:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 639 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:

TCCTTTCATC TTAAGCACCA CCCGACAGGG CAGGTACTAT TACCATCTCC GTTTGACAGA 60

TNAGGAACCT GGCACAGGAA GCATTTAAGT GGATTCCCCA GGATCGCCCC ACTGTCAGGA 120 

GCAGANTCAG AATGGGCCTC AGCATCAGGC TCCCAATCCT GGCTTCTAAC TGCTGCGCTC 180 

TGCCCTTCYC TCWCCCCACC TCCCCACTCC AGTGCCTTTG GTCATGCCAC TGCAGCTTTC 240

AGGCCAATAC TGGATTAGCC TCTTAGTGTT CTTGTCCCTG CAGCCATTTC CCCAGGCAGC 300

AATTCCATGT GCCCTCACTG ATGTAGGTGG CTCTTGTGTC ATTTGTCACA TCCTATTGAA 360

TTGTTTATGC ATCTTGTTCA CACTCACAGC ACCCTCCCTC TCACACGTCC TCCTTATAAA 420

AATGTCCCTC AGTGTCTGCT ATGAGCCAGG TGCAGACTTA AGTGACAGGG CTGCTACGGG 480

AAATAAAAAA 7TAACAAGGA GCACCTGCCT CTTAATGCAC AGTAA^AAAC TATGTTAAGT 540

GTCAGGAAGG AAAGGTTAAG GATOCCAGGA AGGCTTTTAA TAAATAACCT GLACTTAGATG 600

GGCAGGTGGT GCTGARGATT AAGAACGTGT TCTTCTCGA 639



(2) INFORMATION FOR SEQ ID NO: 93:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 744 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:

GAATTCGGCA CGAGAGTGGC TGGAGTCTGG CTGCAGAGGG AAGACATCAG CAGGGAGGGA 60

GCCAGGGCCT GTCACATCTT TCCTCTGGCC ATTGTCCTGG TCTTTGTAAG CCCAGAATCT 120 

ccccrrcccT GAAGGGAGGC CAGCACCCCA GGAGGGCAGC AGGTGTGCTG TGAGGGTTGG 180 


AGTAGTGTGA GAGGTCAGGG TACACTAGAA TGGCCATGGA CACCATGTGG GGGTGCTCTG 240 

GGCTGGGCCA CAGAACAGTG TCCTTCCTGC TGCTCCTCCC CTGCAGCTTC CCCCGACCTT 300 

GTNGTTTATT TGGTTTGATA CCAATCAGCA GACCCTGCAA GGTGGAAGCT CCCAGGCTCT 360

CAGTCCCACS ACTCTCATGT GCCAGTCACC CNTACTGTAA CTGCCCAATG AGTACTTCTT 420 

GCCCACTGCC AAGATAGAGC CAGTTTACCA AGACAGGGGA ATTGCAGTAG AGAAAGAGTT 480 


GAATATACAT AGAGCCAGCT AAATGGGAGA GTGGAGTTTT CTTATTACTT AAATCAGCCT 540 

CCCYTAAAAT TCAGAGGTGA GAATTTTTCA AGGACAGTTT GGTGGSCAGG CCTAGGGAAT 600 

GGATGCTGCT GATTGGCTAG GGATGCAATC ATAGGGGTGT AGAAAAGTWC CTTGTGCACT 660

GAGTCCACTT TTGGTGAGAG CTACCAAGGA GCTGCTGGTC TGCTGGTCCC GGTAGAGCCA 720 

TCTGGTGTCA GGAATGCAAA AGTG 744 

(2) INFORMATION FOR SEQ ID NO: 94:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 526 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:

GCAGGGGAAT TCGGCCACGG AGGGGTTTCA ACAGGGCCCG TGGGGTGAGG TGCAFACACA 60

AAGCCCA7AA GTGCTGGCCT GTTGGGACAA ATGAGAGAAA TCGCATAGGG TGGTGATGAC 120 

AGCGCAYTCA GCCATCYTAY TCC7GGGGAA AATGAAACTT GTGCTCCTAT CAAATGCTCA 180

GTTGTAAAAC TG3AAAAAAA TTTTAGAAGA CATCTTGTCC AGCATCTGTG TT7ATGTCTA 240

TAAAATGTAG AAAACTAAAG CACAGAGATG TTAAATGTTT TGTCCAA3GT CCAACAGCTG 300

GTTAGCARGC TTCGTCTGGT GACCTTTCTA CTGAACCACA GTGCCGCTGG GGGAAGTCCT 360

CAGCACAGAT GGCTGCTGCT ATAGCTGGGG TATGGGCAGT ATTAGTAGTT AACCAGTCAA 420

CCCAAGTTCC GATAGTCTAG GTTCTGCTTC AGCTGGAGGT TAGGGAAAAA CACAAGAAAA 480

TCCCTTACCA GTCTACCAGT GCTGGGGGAT GTACTAAGAG ATCCCC 526



(2) INFORMATION FOR SEQ ID NO: 95:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 426 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:

GGCACAGGGC AGGAGAGACT TGGTCCATGG GGA3AAGCCT GCAGTATAGA TGGGACCTCC 60

AGGAGCCCAA GTAGCATAGA CCCTGCTGAT CCGGGGCCAT TGAGCCAGAG GATTTGGGCT 120

GAATGTCCCC AGAGACAAAA GGGAAAGGTA GATCCTTTCC CTTAAAGATG AAAGCCATCG 180

CCCGGGCTTG CTTATTGCTC TCTCTCCTGG TCCTTCCACA TGTTGTTTCT GAACATTTGT 240 

TCTGGCATCA CAATCCCCGT CATCCTGTCA TCTGGCCCTT CCCACCTTTC CACCTTATCT 300 

CTTGCAGTGT CTCCGCGTCG ACCTGGCACC TGGGTGAARG CTTGCTCTTG CTGGTGCCCA 360

TAGCCCCCAG TGTATGGTCT TGAMCTCCCC AGCCATATGG ARACCCACCT CAGGAGGGCC 420 

CCTCGA 426 



(2) INFORMATION FOR SEQ ID NO: 96:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 844 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:

GGCACAGCGG CACGAGATAG GAAGCTTGGC AGGGGCAGCT CCCCCASTGC GCATTGCCCT 60

GTAACTCGAG CC-CCTGGGAG TGG3GAGAGG CTTGGAAATG GAGCAGGGTG GTGGACCTCG 120 

Tm-rrccTG ctcatcccag gcctcctcca taa :acctac ctagcac:-gc ct'^gggactt 180

CICAGCCCAA GGAACAACTG AGAATACTGA GTGCCAGGGT AGCCCTAGCC CCATTTCACA 240

C:TG3GCAAA 3TGAGGTCAC TGGATTCAAA CACTCAGATT TAAAC ZTCCT -rTGTGTCTO: 300

A3CACCTGTA TATAACTOCC AGCCTCTGCT GCCXTCTCC AAAAA3TCTC TGCCCTTGTC 360

TTTGOCACCT GTCTCTGTCC TCCCCATTCT CTGCTCCTCC TTTCTrCAAC TCAGANTCAC 420

CCTGTTAGTT rAGCAAATGT TCATCGAGCT CCArAATGTA GCAGGACAGG MCTGTCTAAC 480

AGATTCTGGN CTTGCAAGGG TGAGACAAGT ACTCTCCATC TTTCT:TCAT CTTCACAGAT 540

GGTCTGCTCA ACAACTTTGC ACF3AATTGT AAATAATTGA TACTGCATAA AACATT3ATG 600

TTCTTTAAGG GTAGTCCAGC AAGGTGGCAA GTCTTATAAT GATAACTGCT CAAGGATCTC 660 


TCAGTGAAGC ATTTGGGGST GCTAGCTCTG CCTATGGGTG AGGTCAGCTA TCTCACGCCA 720 

TCTACTTCCA CT7TGCCCCCC CATGCCAGGC TCACCCTGAG CTGAGATGCC TGAGCAGGTG 780 

GCAGAAAGGA GCCACCTGGT TTATGCTTCG GGACCACAAA CTCCTCTATC CAGANGACAG 840

TTTT 844 

(2) INFORMATION FOR SEQ ID NO: 97:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1985 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:

AGCCCTGCTG AAGTACAGGT TCTTCTATCA GTTTCTGTTG GGCAATGAAC GAGCAACAGC 60 

AAAGGAGATC AGGGATGAAT ATGTGGAGAC GCTGAGCAAG ATTTACCTGT CTTACTACCG 120

CTCTTACCTG GGGCGGCTCA TGAAGGTGCA GTATGAGGAA GTCGCTGAGA AAGATGATCT 180 

AATGGGTGTG GAAGATACAG CAAAGAAAGG ATTCTYCTCA AAGCCATC3C TCCGCAGCAG 240 

GAACACCATT TTCACCCTAG GAACCCGCGG CTCTGTCATC TCCCCCACTG AACTTGAGGC 300

CCCCATCCTG GTGCCTCACA CAGCGCAGCG GNAGAGCAGA GGTATCCATT TGAGGCCCTC 360

TTCCGCAGCC AGCACTACGS CCTCCTAGAC AATTCCTGCC GCGAATACGT TTTCATCTGT 420

GAATTTTTTG TTGTGTCTGG CCCAGYTGCA CACGACCTGT TCCATGCTGT CATGGGCCGT 480 

ACACTCAGCA TGACCCTGAA ACACCTGGAT TCTTATCTAG CTGACTGCTA CGATGCCATT 540 

GCTGTTTTTC TCTGTATCCA CATTGTTCTC CGGTTCCGTA ACATTGCAGC AAAGAGGGAT 600

GTTCC7GCCC TGGACAGGTA CT3GG3AACA GGTGCTTOCC TTGCTATG3C CACGGTTTGA 660

ACTGATCCTG GAGATGAATG TTCAGAGCGT CCGAAGCACT GACCC3CAGC GCCTAGGGGG 720

GTTGGATACT CGGCCCCACT ATATCACACG CCGCTATGCA GAGTTCTC 2T CCGCTCTTGT 780

CAGTATCAAC CAGACAATTC CTAATGAACG GACCATGCAA TTGCTGGGAC AGCTGCAGGT 840

GGAGGTGGAG AATTTTGTCC TCCSAGTGGC AGCTGAGTTC TCCTCAAG3A AG5AGCAGCT 900

TGTGTTTCTG ATCAACAACT ATGACATGAT GCTGGGTGTG CTGATGGAGtT QGGCTGCAGA 960

TGACAGCAAA GAGGTTGAGA GCTTCCAGCA GCTGCTCAAT GCTCGGA2AC AG3AATTCAT 1020

TGAAGAGTTG CTGTCTCCCC CTTTTGGGGG TTTAGTGGCA TTTGTGAAG3 AG3CTGAGGC 1080 

TTTGATTGAG CGTGGACAGG CTGAGCGACT TCGAGGGGAA GAAGCCCGGG TAACTCAGCT 1140 

GATCCGTGGC TTTGGTAGTT CCTGGAAATC ATCAGTGGAA TCTCTGAGTC AG3ATGTAAT 1200

GCGGAGTrrC ACCAACTTCA GAAATGGCAC CAGTATCATT CAGGGAGC3C TGACCCAGCT 1260 

GATCCAGCTC TATCATCGCT TCCACCGGGT GCTGTCCCAG CCGCAGCTCC GAi3CCCTCCC 1320 


TGCCCGGG2T GAGCTCATCA ACATTCACCA CCTTATGGTG GAGCTCAAGA AGCATAAGCC 1380 

CAACTTCTGA TGTGCCAGAA ACCGCCCTGA GATCTGCCGG TCATCTCCAT GGACTTCTGC 1440 

ACCCCATTCC ATACCCTTCT TCACCTGGGG TACCCCTTCC AGTTTTCCCC TTGCTTCCCA 1500

GGCCCTTGAC ATGGCTTACC TGCCTTCACT CCCAGCACCT TGCCCAACAG GATAAGCTGG 1560 

ATCCCCTTGG CCTTCTGAAT ATCCCAGTGT CTTCAGGTTT CCCAAGACCA CTTCCCTGTG 1620 


GGCTTCCAAA ATGGCCTTTA TCATTTCTCC AGTCTGTCAC CCTCCTTTCC TGCTCCCATA 1680 

CACCCAAGGC TTGTTTCTTC CCCTGTAAAA ACCACTGCCT CAATCTCTCG TTCACTCAAC 1740 

TAGTCACCAT GTCCTGAGGC ATGAAGCCTC CTCAGCTCTT GGAATTGCTG GCAAGGGGTG 1800

ACTGCCTCTG AGTCATTGTG TTTTTCAAAG TGATTTCTTT TCTGTAGCTT TTTGACCTAA 1860 

GATCTCAGCA ATTTGAACAC TAACCTCTCC CCTCCTGGCT CAAGAATTAC TCCGAAGTCA 1920 


GTCTGCAGAA AATAAATATT TAGTATGACA TGAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1980 

AAAAA 1985 

(2) INFORMATION FOR SEQ ID NO: 98:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1416 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:

ATATGAAGGG AAA3AATTTG ATTATGTTTT CTCAA1TGAT GTCAATGAAG GTGGACCATC 60

ATATAAATTG CCATATAATA CCAGTGATGA CCCTTGGTTA ACTGCATACA ACTTCTTACA 120

GAAGAATGAT TTGAATCCTA TGTTTCTGGA TCAAGTAGCT AAATTTATTA TTGATAACAC 180

AAAAGGTCAA ATGTTGGGAC TTGGGAATCC CAGCTTTTCA GATCCATTTA CAGGTGGTGG 240

TCGGTATGTT CCGGGCTCTT CGGGATCTTC TAACACACTA CCCACAGCAG ATCCTTTTAC 300

AGGTGCTGGT CGTTATGTAC CAGGTTCTGC AA3TATGGGA ACTACCATGG CCGGAGTTGA 360

TCCATTTACA GGGAATAGTG CCTACCGATC AGCTGCATCT AAAACAATGA ATATTTATTT 420

CCCTAAAAAA GAGGCTGTCA CATTTGACCA AGCAAACCCT ACACAAATAT TAGGTAAACT 480

GAAGGAACTT AATGGAACTG CACCTGAAGA GAAGAAGTTA ACTGAGGATG ACTTGATACT 540

TCTTGAGAAG ATACTGTCTC TAATATGTAA TAGTTCTTCA GAAAAACCCA CA3TCCAGCA 600

ACTTCAGATT TTGTGGAAAG CTATTAACTG TCCTGAAGAT ATTGTCTTTC CT3CACTTGA 660

CATTCTTCGG TTGTCAATTA AACACCCCAG TGTGAATGAG AACTTCTGCA ATGAAAAGGA 720

AGGGGCTCAG TTCAGCAGTC ATCTTATCAA TCTTCTGAAC CCTAAAGGAA AGCCAGCAAA 780 

CCAGCTGCTT GCTCTCAGGA CTTTTTGCAA TTGTTTTGTT GGCCAGGCAG GACAAAAACT 840 

CATGATGTCC CAGAGGGAAT CACTGATGTC CCATGCAATA GAACTGAAAT CAGGGAGCAA 900 

TAAGAACATT CACATTGCTC TGGCTACATT GGCCCTGAAC TATTCTGTTT GTTTTCATAA 960 

AGACCATAAC ATTGAAGGGA AAGCCCAATG TTTGTCACTA ATTAGCACAA TCTTGGAAGT 1020

AGTACAAGAC CTAGAAGCCA CTTTTAGACT TCTTGTGGCT CTTGGAACAC TTATCAGTGA 1080

TGATTCAAAT GCTGTACAAT TAGCCAAGTC TTTAGGTGTT GATTCTCAAA TAAAAAAGTA 1140 

TTCCTCAGTA TCAGAACCAG CTAAAGTAAG TGAATGCTGT AGATTTATCC TAAATTTGCT 1200 

GTAGGAGTGG GGAAGAGGGA CGGATATTTT TAATTGATTA GTGTTTTTTT CCTCACATTT 1260 

GACATGACTG ATAACAGATA ATTAAAAAAA GAGAATACGG TGGATTAAGT AAAATTTTAC 1320 

ATCTTGTAAA GTGGTGGGGA GGGGAAACAG AAATAAAATT TTTGCACTGC TGAAAAAAAA 1380 

AAAAAAAAAA AAAAGGAAAC TCGAGGGGGG GCCCGG 1416 

(2) INFORMATION FOR SEQ ID NO: 99:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1935 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99:

NTCTACCCTA ATCAAGATGG GGACATACTT CGC^.CCAGG 7TCTTCATOA ACATATCCAG 60

AGATTGTCTA AAGTAGTGAC TGCAAATCAC AGAGCTCTTC AGATACCAGA GGTTTATCTT 120

CGAGAAGIAC CATGGCCATC TGCACAATGA GAAATCAGGA CAATAAGTGC TTATAAAACC 180

CCCCGGGACA AAGTGCAGTG CATCCTGA3A ATGTGCTCTA CGATTATGAA CCTCCTGAGC 240

CTGGCCAATG AGGACTCTGT CCCTGGAGCG GATGACTTTG TTCCTGTGTT GGTGTTTGTG 300 

TTGATAAAGG CAAATCCACC CTGTTTGCTG TCTACTGTGC AGTATATCAG TAGCTTTTAT 360 


GCTAGCTGTC TGTCTGGAGA GGAGTCCTAT TGGTGGATGC AGTTCACAGC AGCAGTAGAA 420 

TTCATTAAAA CCATCGATGA CCGAAAGTGA CCAAGACCAA GGCCCACCAA GGCAGCAGAC 480 

TGTTAATCAG ACAAACAGAT CTCTGAGAAG GTGCATCAGC TGCTTTGAAG GCTGAAGATT 540

GTTTTGTATG ATACTGCACA GCATCAGGCA TTTTAAAGCA GATCITTACT AAACAGGTTA 600 

ATGAGCTAAC AAGCAGGTTC TCTCGTCTTT GGGCTCTTTC CTTTCTGAGT TGCATATTCT 660 

ATTTTCTTGT CCCCAAGTAG AGACTAGTAC TACAAAAAG3 GACCACATTT TTCAAGTATT 720

TCTAAGTATA AAAAACAAAA CAAAAATCTC TTAGGAAATG TCTA3ACCTC CATTCTTGGA 780

TTCCCTTTCT TTCCTTTTAT TTTAAAAAAG AACAGTACCC CTCTTTTAAG ATGCTGTCTT 840

ACATTAATGA GCATCTAATG GAAAGAAGGT ATGAGTTGCA CTGAGGATTA GAATAGTGGT 900

GCGTTAGTGG CATTATCTAT AAATACACTC ACCTAAATTG AAAGCTAAGA AGGAAATGTA 960

AATATAATAT ATATTTATAT TTGATGTAAT ATGGACATCT GCAGATTCTA ATAAACAAGG 1020

ACTATTGCTG ATAGTAGGCT GTGACATACT GTCTTGTGAA ATGGTTTCCT TGACAAAATT 1080 

TAAGCTGAGC TTAAAAGCAA AAAAACAAAA AGTACACAGA AATATTTATT AAAATGTAAT 1140

ACAGTTTATT GAACTTTCTA GGTATGGAGT TTGATGGACA GGGCTGCCTY TAATGAGTGT 1200 

GAAGGTCACT AAGTCACTTA GACATCTCAC CGTGGAAGTT TGTGAGCCTG CATTAGGAGA 1260 


TAGACTGATT ACCATACATG ACATAAAAAG GAACAGTGGA TAGCTCATAC TTTATGGTGG 1320 

TTCTTCTCCT CCGAAATAAT ATACTGCAGA AATCCCAGAC AGAGCTCCTT ACAAACCTTT 1380 

AATTGTAATA TATTTTTGAT GATTATTCAC ATTGAATGCA CAGACCAAGA ATTCAGTGAA 1440

TGTCATTTTT TAAAAAACTA ATTTGTATTG TCTGCTCTAG TGATACAAGT TTTACTAGTG 1500 

ATAAACTATT TTAATCAACC ATACTATTCT TATGGAAAAA AATATCTATT TTGGCAGGTT 1560 


TCTGTGCCTT TATTTCCCTC TTCTGAAAAA AAGTCTGTGT TTTCATAGTT TGGTTTGCAT 1620 

TGTATATCAA TAATTAATCA GGAATGGGTT TTGGTGCCTG AAAAATTGGC CATGGAGGCA 1680 

CACCAAAGCT TCAAGCACAA GTCTTGTACA TGGGCCATCA CTGTCTGGTT TCACTTCGTG 1740

TOTTTCCTAA ACACATTTAG CTGCTTTTTT AACAAACTCA GCCCCATACT TGAG7CCC7T 1800

GTTGTTGGGA GCATTTCCAG GCATCTTT7A AGGGAACTGT GACAAACAGC CTCGGGCAGA 1860

TGAACACGGA GGCTCTCTGT TGTCTGTCTC TGAGATCTTT GTGTCTGG3A ATGCCTAAAG 1920

NTTTTGNTTT TTTTT 1935

(2) INFORMATION FOR SEQ ID NO: 100:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 599 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:

GAATTCGGCA CGAGCGTCCA CGCAGCCGCC GGCCGGCCAG CACCCAGGGC CCTGCATGCC 60

AGGTCGTTGG AGGTGGCAGC GAGACATGCA CCCGGCCCGG AAGCTCCTCA GCCTCCTCTT 120

CCTCATCCTG ATGGGCACTG AACTCACTCA AGACTCCGCT GCCCCCGACT CCCTGCTGAG 180

AAGTTCAAAG GGCAGCACGA GGGGG'TCTT GGCTGCTATT GTCATCTGGA GGGGGAAGAG 240

TGAGAGCCGG ATAGCCAAGA CCCCAGGCAT TTTCAGAGGT GGCGGGACCT TAGTCCTACC 300

CCCAACACAC ACCCCTGAGT GGCTCATCCT CCCTTTGGGC ATAACGCTGC CCTTGGGGGC 360

TCCAGAAACA GGCGGTGGGG ATTGTGCCGC TGAGACCTGG AAGGGCAGCC AGCGTGCCGG 420
CCAGCTGTGT C^IATTGCTGG CTTAATATGC AGGGCTTGGG GGGCTGTGGC CACATGCCCG 480 
GCAGGAGGTG AGTGAGGAGC CCTGTGGCGT GCTGGTGTGG GGATCGTGGG CATTTCAAAC 540 
GGGCTTGTCG TACCCTGAAC AATGTATCAA TAGAGAAAAA AAAAAAAAAA AAAACTCGA 599 




(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 784 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: 


GAATTCGGCA CAGAAAAAAA AGAGAGACTG GGTCTTACTG TGTTGCCCAG ACTTGTCTTG 60 
AACTCCTGCC TCAGCCTCTC AAGTACTTGG GATTATAGGC CAAGAAGCCA CCATGCCTAG 120 
CTTCTTCCTG TCATTGATCC AGACTAATAC TCTGGGGTCA GCCTCATTTC TTCTCTTTCT 180

CACTTTGCAZ ATCCACTTGT CACCAAATCK RGTTCATTCT GCATCCTAAi; TAAGTCCTTT 240

GATTCCTCCA GTTGTTCATT AGTAATGTCT CAARTGTAAT TTTTTCTAGT AGTTTTCAGC 300

:TGTCTTTCC KGCCTTCAGT CTTAACTTCT CCAGTACATA KGCCACATTG TTGTCAGCAK 360

GATCAWATTT TATTTAAAAA TACTTTACAW AKGTTTATKG CCAAATATTA GRAAATACAG 420

ATTCATGGAA AGAAAAATCA CTGTCCCAAG GAGGTCACTG GCATGGTGAj GTTAAGGGGT 480

GATTTTA^TT TTTAAAAATG TATATTTTTT CCTGTGTAGA GTAGTAACA2 CCTTGAAAAC 540

ACAWTCCCTT GTAAAGTCTC TAATTCTGTA CTCCGCATCT AGSTGRTCT: TTCTTTCTCA 600

GATATTTTAC AATTTCATTT ATCACCACCT TTCTCTAGCC TTTACCCGTG TCTTCAATAT 660

TWACATATGC AGAAGTTTCT CCTAACAAAC ACCTGCCTCT GCGTCAGTTG TGCTAGCACC 720

CTGTTGCTTT CTTTCCCTTC ACAATCAAAT TTAAGAGTGT CAAAAAAAAA AAAAAAAAAC 780

TCGA 784



(2) INFORMATION FOR SEQ ID NO: 102:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1035 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:



AGAGGCCTGG CTGCGTTGCC CTATCTCCGT CTCCGCCACC CACTTAGCGT TTTAGGCATC 60 



AATTACCAGC AGTTTCTCCG CCACTATCTG GAAAATTACC CGATTGCTCC CGGCAGAATA 120 


CAAGAGCTTG AAGAACGCCG CAGTTGCGTG GAAGCCTGCA GAGCAAGGGA AGCAGCGTTT 180 



GATGCCGAAT ATCAGCGAAA TCCTCACAGG GTGGACCTCG ATATTTTAAC CTTTACGATA 240 
GCTCTGACTG CCTCTGAAGT TATCAACCCT CTGATAGAAG AACTTGGTTG CGATAAGTTT 300



ATCAATAGAG AATAGTTAGG TGGTGACACT ACTTCAAGAG AACCTCTGCA TTCCAGTCAT 360 



ACCAATCCTG CAACTTGATT TTCAGAAGTC AAGAGTATAT CGCGATAAGA CAGTGCACAG 420 


GTGGAGGGGA AAAAAAGGGG GAGGGGGAAG CTTATCTTGA AAAAGCATCA CAGAAGTAGA 480 



AAAAAATGTC GAAAGCATTA TAACTGTAAC GTTCTTTGAG TTTGTGATTG ATCCACATTT 540 
TTCCCCCTGC ATTATGGAAA ATGTCTCTCA GCATTGCTTT ATTACAAAGT AAAGGATGGT 600

TTTATAAAAT TGAGACTGAT GAAACATCAA TACTAGAGCG CATGAGGATG AAAGAAATTA 660



TCAAATAGTG CTGAACAGAA TAAGATGTTA ACGCTGAGTT ATTAGGACTG GAAGGCTATG 720 

AAAAGAACTT GAAATTGTCG GAA IATGTGC TCTCTTCAT 3 TCATATTCAA TA3AAGTTTC 780

TAGTTTAAGA TTGATTTTGT GTTTTCTTAG GCATTTCAA 3 TGACAAGCAA A37AAATGTA 840

TATATTATGT GATAAATCAT GTTTTCAAGA ACGTCAAATT TCTjGACTTT TTTCTTTCAA 900

TTTTTAATTT TTAAAGTTTT TTTGGTATTA AAAAATCYAT TCAIAAGCGA AAAAATV.TV.T 960

WAAATWTWCM GCGAAAAGCC AAAAAAAAAA AAAAMMAGG 3 GGGGCCGGGC CCrATCCCCC 1020

CAAGGGGGTC CNGNT 1035



(2) INFORMATION FOR SEQ ID NO: 103:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2218 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103:

AGGTATTAGG CCCTTTTGTG GGAGCCCCAT GTTTTGTTTT TCTGAGTTGG TGGGGAGGGA 60 

SGGAGGGGGA GGGCTGAATT GTTTTGCAGA GGAAGATGGC ATCTGTGCTT TAAATTTCTC 120 

ATTACTGGGT TAGAAAACAA AGAGGGAKTG CCCTGCACAT TTTCTTTTGT GCTTTTAAAT 180

GTTTCTTAAG TTGGAACAGG TTTCCTCGGG CCTGTTTTGA CTGATTGCTG GAGTGCATTT 240

GATAGTTAAA AATTACTAAT TGGTTTTATT TCCCTTCACA CTCTGCCTCC CCACTTCTCC 300

CCCCGTTACT GAAAAATAAC CATTTTAGTG TCAGGCTAGA AATTGAATTG CTGAGTTTTG 360 

TGTATCCTTT AAATTAAAAA CCACAAGTGT TTATTGTAGT GGTTAAACTG TAGCATCTCA 420 

GCATCTGGGT GGAAGCTGCC TATATTTCTT CCCAGTTTAA CTGGGGACCA TCTGTGAAAT 480

TAATTTTCCA TCCAGACAGC TGCTGTGAGC AAATGAACAT AAATGCTCGC TGGAAATTTA 540 

CTAACCAGTT TTTATATTGA CCTGCAGTGT AAAAAGCACA TTTAATTATA AACAATATAT 600 

TCAAAATGGG CAAATTTTAT TTTCAAATGC AGTGTAGAGC TAGATTAAAA GCAACTCTTT 660

GCCACCTACT CTGCCCTTTT GGCAAAGTTA CCTTGAACAA AGAATCTTAA GGGTTTATTA 720

AGAACTCTTT ATTTTCTTCA TACCCTGTTC TCTGCAGTGC TTTCTAACAG CTTCTGGGTG 780

CAGATTTTCT TCGGCATCCT TTTGCACTCA GCTTATTACA GGTAGGTAGT GCTTAAGAAA 840 

AGTCATGGAG GACTAAAGCC TAAGTCCTTT TCACTTTTCC TCCATCTGAA GGTAGGTGAG 900 


TTCATCCTCT TCATAGTAAT GCTGTTTTAC CAAGACTTTA TAGCAGATGG ACCCAGAAAG 960 

AATTTTCTGC TATTGTGTTC ACTACAACAG GATAGGGACA TCAGACAGCC CCAGAAACCC 1020 

CTTCCAGATC TGATATGGGA CTATTAATTT TTATGCTGTT AATTGGTATT CATTCACAAT 1080

GCAGTTGAAG GGGGAAGGCT CCACTGCATT CTTTGGCTAA GGCCTGAATG CTTGCTCATC 1140

TGTAAGATCT ATACTCGAGG 7TTTGTTTTC CTTTTAAAAT TCTTTAGGGA GAGAGGGA7G 1200

GTTTCTGAGG GGTTCTGAAA GTATGATTCA ATGTGCAACA TACAGGTAGJ TCTTCAGCAT 1260

AAGCTGAAAT ATATGCATGT A\AAACTTTG ACATCTTTTT TTTTAATTTT CCACTTTCTT 1320

CTTAACTTTA CTTCTCTTTT TGTCCCCCCC CCATCTTACA GAAGTTGAGG CCAAGGGAGA 1380

ATGGTAGGCA CAGAAGAA^C ATGGCAAACT GCTCTGTGCT TTCAAACCAA AGTGTTCCCC 1440

CCAACCCCAA ATTTGTCTAA GCACTGGCCA GTCTGTTGTG GGCATTGTTT TCTACAACCA 1500

AATTCTGGGT TTTTTTCTTC TTTCTTTAAA CATAGAGGTA CCACCACAAG GGATGCCCTA 1560

CTCTCTCGCA GCTCTTGAAA GCATCTGTTT GAGGGAAAGG TCTCTGGGCA AGCAAGTGGT 1620

TATTTGGATT GCTTGCTTCC CTTTTTCCAC CTGGGACATT GYAATCATAA AATAACAGTA 1680

AATTCCAAAC CTCAAAAACT ATTATGGCCT GAGCACAGCT GAAATCTAGC AGAGTTTAAC 1740

TCTTCTGCCT CCATGTCTGT CACTTATAAT TCAGGTTCTG CTGTTGGCTT CAGAACATGA 1800

GCAGAAGAAT CGTTTTATGC TAGTTATTGC ATTCATGGTT GAAACTCAAC TTAGGGAAAG 1860

GGTTCCAATG TATTAAGCAA TGGGCTGCTT CTCCCCAATC CTCCCTAACA ATTCGTTGTG 1920

TGGACTTCTC ATCTAAAAGG TTAGTGGCTT TTGCTTGGGA TCAGTGCTCT CTATTGATGT 1980

TCTTGCTGGT CTCCAGACAC ATTCCTGTTG CATTAAGACT TGAAAGACTT GTAGATGTGT 2040 






GATGTTCAGG CACAGGATGC TGAAAGCTAT GTTACTATTC TTAGTTTGTA AATTGTCCTT 2100 
TTGATACCAT CATCTTGTTT TCTTTTTGTA GGTATAAATA AAAACACTGT TGACAATAAA 2160 
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAA 2218 

(2) INFORMATION FOR SEQ ID NO: 104:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1351 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:

CTTCACAGAC TGACAGAATG GTTTTGTTTT GTTTTGTTTT GTTTTGTTTT GTTTTTGAGA 60

TGGACTCTAG CTCTGTCACC CAGGCTGGAG TGCAGTGGTG CGATCTCGGC TCACTGCAAG 120

CTCCGCCTCC CGGGTTCTCA CCATTCTCCT GCCTCAGCCT CCCGAGTAGC TGGGACTACA 180

GGCGCCCACC ACCACGCCCG GCTAATTTTT TGTATTTTTT AGTAGAGACG GC^TTTCACC 240

ATGTTAGCCA GGATGGTCTC GATCTCCTGA CCTC3TGATC CGCCCSCYTC 3GCCTCGCAA 300

AGTGCTGGGA TTACAGGCGT GAGCCACCGT GCCTGCCCCA GAATGSTTTT TAAAGCCACA 360

GTTGAGARGC CACCCATTGC CCGGCGCCTG GACAGTGATC ATCTTGTTCA TCTTGTTCAG 420

TCCTTTCTTG TGTGATTGGA ATTATTCATC CCCTTTGAAA GATGAGAAGG TTGAGATGCA 480

AAGAGTCTAC CTTTCCAAGT TCTCACTGCT GGAAAGARCT AGAAGCACAG TTCAAAGTTC 540

?<X»ITTCTGG ACTCTGCAGT CCAGGTYTCC CTTYTCCCAC TTGCCTACCC TCAATGCCAC 600

ACTGTTTTTG AAGTGGCCCA TAACTTGAAG GRAAAGTTTA AAGACAGTTC AATTTAATCA 660

TCAGRATGCA TTCTTTTTTT TTTCGGARAC GGAKTTTCAC TCTTGCTGCC CASGCTGGAG 720

TGCAATGGTG CAATGATCTC GGCTCACTGC AACCTATGCC TCCTGGGTTC AAGNGATTAT 780

CCAGCCTCAG CCTCCCGAGT AGCTGGGATT ATGGGCGCCC ACCACCATGC CCAGCTAATT 840

TTTGTATTTT TTTTTTTAGT AGAGATGGGG TTTCGCCAGG TTGGCCAGGC TGKTCTTGTG 900

AAYTCCTGGC YTCAGGTGAT YTGCCCACYT CATCYTCCAA AAGTGCTGGG ATTACAGGCA 960

TGAGCCACTG CGCCTGGCYT CAGAATGCAT TCTTACACAT CTATCCTAGA CATTTATAAG 1020

CAGTCTAATG GATAACAATC CAAGAATAAA TGATTGTAAA AGATGATGCC GAAGAGTTGA 1080

TGTCAATCTT TTTTTCCTAA GAAAAAAAGT CCGCGAGTAT TAAATATTTA GATCAATGTT 1140

TATAAAATGA TTACTTTGTA TATCTCATTA TTCCTATTTT GGAATAAAAA CTGACCTTCT 1200

TTAATCATAT ACTTGTCTTT TGTAAATAGC AGCTTTTGTG TCATTCTCCC CACTTTATTA 1260

GTTAATTTAA ATTGGAAAAA ACCCTCAAAC TAATATTCTT GTCTGTTCCA GTCTTATAAA 1320

TAAAACTTAT AATGCATGTA AAAAAAAAAA A 1351

(2) INFORMATION FOR SEQ ID NO: 105:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2066 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105:

GGCACGAGGC GGCGGAGGGC CACAATCACA GCTCCGGGCA TTGGGGGAAC CCGAGCCGOG 60

TGCGCCGGGG GAATCCGTGC GGGCGCCTTC CGTCCCGGTC CCATCCTCGC CGCGCTCCAG 120

CACCTCTGAA GTTTTGCAGC GCCCAGAAAG GAGGCGAGGA AGGAGGGAGT GTGTGAGAGG 180

AGGGAGCAAA AAGCTCACCC TAAAACATTT ATTTCAAGGA GAAAAGAAAA AGGGGGGGCG 240

CAAAAATGGG TGGGGCAATT ATAGAAAACA TGAGCACCAA GAAGCTGTGC ATTGTTGGTG 300

GGATTCTGCT CGTGTTCCAA ATCATCGCCT TTCTGGTGGG AGGCTTGATT GCTCCAGGGC 360

CCACAAGGGC AGTGTCCTAC ATGTCGGTGA AATGTGTGGA TGCCCGTAA3 AACCATCACA 420 

AGACAAAATG GTTCGTGCCT TGGGGACCCA ATCATTGTGA CAAGATCCGA GACATTGAAG 480

AGGCAATTCC AAGGGAAATT GAAGCCAATG ACATCGTGTT TTCTGTTCA3 ATTCCCCTCC 540

CCCACATGGA GATGAGTCCT TGGTTCCAAT TCATGCTGTT TATCCT3CAG CTGGACATTG 600

CCTTCAAGCT AAACAACCAA ATCAGAGAAA ATGGAGAA3T irXCATGGAC GTTTCCCTGG 660

CTTACCGTGA TGACGCATTT GCTGAGTGGA CTGA a ATGGC CCATGAAAGA GTACCACGGA 720

AACTCAAATG CACCTTCACA TCTCCCAAGA CTCCAGAGCA TGAGG3C03T TACTATGAAT 780

GTGATGTCCT TCCTTTCATG GAAATTGGGT CTGTGGCCCA TAAGTTTTAC CTTTTAAACA 840 

TCCGGCTGCC TGTGAATGAG AAGAAGAAAA TCAATGTGGG AATTGGGGAG ATAAAGCATA 900

TCCGGTTGGT GGGGATCCAC CAAAATGGAG GCTTCACCAA GGTGT3GTTT GCCATGAAGA 960 

CCTTCCTTAC GCCCAGCATC TTCATCATTA TGGTGTGGTA TTGGAGGAGG ATCACCATGA 1020 

TGTCCCGACC CCCAGTGCTT CTGGAAAAAG TCATCTTTGC CCTTGGGATT TCCATGACCT 1080

TTATCAATAT CCCAGTGGAA TGGTTTTCCA TCGGGTTTGA GTGGACCTGG ATGCTGCTGT 1140

TTGGTGACAT CCGACAGGGC ATCTTCTATG CGATGCTTCT GTCCTTCTGG ATCATCTTCT 1200

GTGGCGAGCA CATGATGGAT CAGCACGAGC GGAACCACAT TGCAGOGTAT TGGAAGCAAG 1260 

TCGGACCCAT TGCGGTTGGC TCCTTCTGCC TCTTCATATT TGACATGTGT GAGAGAGGGG 1320 

TACAACTCAC GAATCCCTTC TACAGTATCT GGACTACAGA CATTGGAACA GAGCTGGCGA 1380

TGGCCTTCAT CATCGTGGCT GGAATCTGCC TCTGCCTCTA CTTCCTGTTT CTATGCTTCA 1440

TGGTATTTCA GGTGTTTCGG AACATCAGTG GGAAGCAGTC CAGCCTGCCA GCTATGAGCA 1500

AAGTCCGGCG GCTACACTAT GAGGGGCTAA TTTTTAGGTT CAAGTTCCTC ATGCTTATCA 1560 

CCTTGGCCTG CGCTGCCATG ACTGTCATCT TCTTCATCGT TAGTCAGGTA ACGGAAGGCC 1620 


ATTGGAAATG GGGCGGCGTC ACAGTCCAAG TGAACAGTGC CTTTTTCACA GGCATCTATG 1680 

GGATGTGGAA TCTGTATGTC TTTGCTCTGA TGTTCTTGTA TGCACCATCC CATAAAAACT 1740 

ATGGAGAAGA CCAGTCCAAT GGAATGCAAC TCCCATGTAA ATCGAGGGAA GATTGTGCTT 1800

TGTTTGTTTC GGAACTTTAT CAAGAATTGT TCAGCGCTTC GAAATATTCC TTCATCAATG 1860 

ACAACGCAGC TTCTGGTATT TGAGTCAACA AGGCAACACA TGTTTATCAG CTTTGCATTT 1920 

GCAGTTGTCA CAGTCACATT GATTGTACTT GTATACGCAC ACAAATACAC TCATTTAGCC 1980

TTTATCTCAA AATGTTAAAT ATAAGGAAAA AAGCGTCAAC AATAAATATT CTTGAGTATA 2040

AAAAAAAAAA AAAAAAAAAA AAAAAA 2066

(2) INFORMATION FOR SEQ ID NO: 106:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1705 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106:

AATTCGGCAK AGGGCAGCTG TCGGCTGGAA GGAACTGGTC TGCTCACACT TGCTGGCTTG 60



CGCATCAGGA CTGGCTTTAT CTCCTGACTC ACGGTGCAAA GGTGCACTCT GCGAACGTTA 120 



AGTCCGTCGC CAGCGCTTGG AATCCTACGG CCCCCACAGC CGGATCCCCT CAGCCTTCCA 180 


GGTCCTCAAC TCCCGYGGAC GCTGAACAAT GGCCTCCATG GGGCTACAGG TAATGGGCAT 240 



CGCGCTGGCC GTCCTGGGCT GGCTGGCCGT CATGCTGTGC TGCGCGCTGC CCATGTGGCG 300

CGTGACGGCC TTCATCGGCA GCAACATTGT CACCTCGCAG ACCATCTGGG AGGGCCTATG 360



GATGAACTGC GTGGTGCAGA GCACCGGCCA GATGCAGTGC AAGGTGTACG ACTCGCTGCT 420 



GGCACTGCCG CAGGACCTGC AGGC3GCCCG CGCCCTCGTC ATCATCAGCA TCATCGTGGC 480

TGCTCTGGGC GTGCTGCTGT CCGTGGTGGG GGGCAAGTGT ACCAACTGCC TGGAGGATGA 540 



AAGCGCCAAG GCCAAGACCA TGATCGTGGC GGGCGTGGTG TTCCTGTTGG CCGGCCTTAT 600 



GGTGATAGTG CCGGTGTCCT GGACGGCCCA CAACATCATC CAAGACTTCT ACAATCCGCT 660

GGTGGCCTCC GGGCAGAAGC GGGAGATGGG TGCCTCGCTC TACGTCGGCT GGGCCGCCTC 720

CGGNCTGCTG CTCCTTGGCG GGGGGCTGCT TTGCTGCAAC TGTCCACCCC GCACAGACAA 780

GCCTTACTCC GCCAAGTATT CTGCTGCCCG CTCTGCTGCT GCCAGCAACT ACGTGTAAGG 840 



TGCCACGGCT CCACTCTGTT CCTCTCTGCT TTGTTCTTCC CTGGACTGAG CTCAGCGCAG 900

GCTGTGACCC CAGGAGGGCC CTGCCACGGG CCACTGGCTG CTGGGGACTG GGGACTGGGC 960



AGAGACTGAG CCAGGCAGGA AGGCAGCAGC CTTCAGCCTC TCTGGCCCAC TCGGACAACT 1020 



TCCCAAGGCC GCCTCCTGCT AGCAAGAACA GAGTCCACCC TCCTCTGGAT ATTGGGGAGG 1080

GACGGAAGTG ACAGGGTGTG GTGGTGGAGT GGGGAGCTGG CTTCTGCTGG CCAGGATGGC 1140 



TTAACCCTGA CTTTGGGATC TGCCTGCATC GGTGTTGGCC ACTGTCCCCA TTTACATTTT 1200 



CCCCACTCTG TCTGCCTGCA TCTCCTCTGT TGCGGGTAGG CCTTGATATC ACCTCTGGGA 1260



CTGTGCCTTG CTCACCGAAA CCCGCGCCCA GGAGTATGGC TGAGGCCTTG CCCACCCACC 1320 



TGCCTGGGAA GTGCAGAGTG GATGGACGGG TTTAGAGGGG AGGGGCGAAG GTGCTGTAAA 1380 

CAGGTTTGGG CAGT3GTGGG GGAGGGOGCC AGAGAGGCGG CTCAGGTTGC CCAGCTCTGT 1440

GGCCTCAGOA CTCTCTGCCT CACCCG:TTC AGCCCAGGGC CCCTGGAGAC TGATCCCCTC 1500

TGAijTCCTCT GCCCCTTCCA AGGACACTAA 7GAGCCTGGG AGGGT3GCAG GGAGGAGGGG 1560

ACA3CTTCAC CCTTGGAAGT CCT-JGGGTTT TTCCTCTTCC TTCTTTGTGG TTTCTGTTTT 1620

GTAATTTAAG AAGAGCTATT CATCACTGTA ATTATTATTA TTTTCTACAA TAAATGGGAC 1680

CTGTGCACAG GRAAAAAAAA AAAAG 1705

(2) INFORMATION FOR SEQ ID NO: 107:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1167 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107:

TGCAGGAATT CGGCAGAGGT TTTCCGCTAG ACTCTGGCAG TTGGTGAGCA TCATGGCAAC 60 

CGTTACAGCC ACAACCAAAG TCCCGGAGAT CCGTGATGTA ACAAGGATTG AGCGAATCGG 120 

TGCCCACTCC CACATCCGGG GACTGGGGCT GGACGATGCC TTGGAGCCTC C<XA.GGCTTC 180

GCAAGGCATG GTGGGTCAGC TGGCGGCACG GCGGGCGGCT GGCGTGGTGC TGGAGATGAT 240 

CCGGGAAG3G AAGATTGCCG GTCGGGCAGT CCTTATTGCT GGCCAGCCGG GCACGGGGAA 300 


GACGGCCATC GCCATGGGCA TGGCGCAGGC CCTGGGCCCT GACACGCCAT TCACAGCCAT 360 

CGCCGGCAGT GAAATCTTCT CCCTGGAGAT GAGCAAGACC GAGGCGCTGA CGCAGGCCTT 420 

CCGGCGGTCC ATCGGCGTTC GCATCAAGGA GGAGACGGAG ATCATCGAAG GGGAGGTGGT 480

GGAGATCCAG ATTGATCGAC CAGCAACAGG GACGGGCTCC AAGGTGGGCA AACTGACCCT 540

CAAGACCACA GAGATGGAGA CCATCTACGA CCTGGGCACC AAGATGATTG AKTCCCTGAC 600

CAAGGACAAG GTCCAGGCCG GGGACGTGAT CACCATCGAC AAGGCGACGG GCAAGATCTC 660 

CAAGCTGGGC CGCTCCTTCA CACGCGCCCG CGAACTACGA CGCTATGGGC TCCCAGACCA 720 

AGTTCGTGCA GTGCCCAGAT GGGGAGCTCC AGAAACGCAA GGAGGTGGTG CACACCGTGT 780

CCCTGCACGA GATCGACGTC ATCAACTCTC GCACCCAGGG CTTCCTGGCG CTCTTCTCAG 840 

GTGACACAGG GGAGATCAAG TCAGAAGTCC GTGAGCAGA? CAATGCCAAG GTGGCTGAGT 900 

GGCGCGAGGA GGGCAAGGCG GAGATCATCC CTGGAGTGCT GTTCATCGAC GAGGTCCACA 960

TGCTGGACAT CGAGAGCTTC TCCTTCCTCA ACCGGGCCCT GGAGAGTGAC ATGGCGCCTG 1020

TCCAGCAGGT CTATGGGGAT GCCGTGAGGG CTCTGGTAGC TGGTGCCCCG GATTCGCGTG 1080

ATGCCACGGT TGGTGGCCTC GTOCCGAATT CC7GCAGCCC GGGGGATCCA CTAGTTCTAG 1140

AGCGGCCGCC ACCGCGGTGG ANCTCCN 1167

(2) INFORMATION FOR SEQ ID NO: 108:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1907 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:

GGCACAGGGG AATCATCGTG TGATGTGTGT GCTGCCTTTG TGAGTGTGTG GAGTCCTGCT 60 

CAGGTGTTAG GTACAGTGTG TTTGATCGTG GTGGCTTGAG GGGAACCCTr GTTCAGAGCT 120

GTGACTGCGG CTGCACTCAG AGAAGCTGCC CTTGGCTGCT CGTAGCGCCG GGCCTTCTCT 180

CCTCGTCATC ATCCAGAGCA GCCAGTGTCC GGGAGGCAGA AGGTACCGGG GCAGCTACTG 240

GAGGACTGTG CGGGCCTGCC TGGGCTGCCC CCTCCGCCGT GGGGCCCTGT TGCTGCTGTC 300 

CATCTATTTC TACTACTCCC TCCCAAATGC GGTCGGCCCG CCCTTCACTT GGATGCTTGC 360 


CCTCCTGGGC CTCTCGCAGG CACTGAACAT CCTCCTGGGC CTCAAGGGCC TGGCCCCAGC 420 

TGAGATCTCT GCAGTGTGTG AAAAAGGGAA TTTCAACGTG GCCCATGGGC TGGCATGGTC 480 

ATATTACATC GGATATCTGC GGCTGATCCT GCCAGAGCTC CAGGCCCGGA TTCGAACTTA 540

CAATCAGCAT TACAACAACC TGCTACGGGG TGCAGTGAGC CAGCGGCTGT ATATTCTCCT 600

CCCATTGGAC TGTGGGGTGC CTGATAACCT GAGTATGGCT GACCCCAACA TTCGCTTCCT 660

GGATAAACTG CCCCAGCAGA CCGGTGACCG TGCTGGCATC AAGGATCGGG TTTACAGCAA 720

CAGCATCTAT GAGCTTCTGG AGAACGGGCA GCGGGCGGGC ACCTGTGTCC TGGAGTACGC 780

CACCCCCTTG CAGACTTTGT TTGCCATGTC ACAATACAGT CAAGCTGGCT TTAGCGGGGA 840

GGATAGGCTT GAGCAGGCCA AACTCTTCTG CCGGACACTT GAGGACATCC TGGCAGATGC 900 

CCCTGAGTCT CAGAACAACT GCCGCCTCAT TGCCTACCAG GAACCTGCAG ATGACAGCAG 960 


CTTCTCGCTG TCCCAGGAGG TTCTCCGGCA CCTGCGGCAG GAGGAAAAGG AAGAGGTTAC 1020 

TGTGGGCAGC TTGAAGACCT CAGCGGTGCC CAGTACCTCC ACGATGTCCC AAGAGCCTGA 1080 

GCTCCTCATC AGTGGAATGG AAAAGCCCCT CCCTCTCCGC ACGGATTTCT CTTGAGACCC 1140

AGGGTCACCA GGCCAGAGCC TCCAGTGGTC TCCAAGCCTC TGGACTGGGG GCTCTCTTCA 1200

GTGGCTGAAT GTCCAGCAGA GCTATTTCCT TCCACAGGGG GCCTTGCAGG GAAGGGTCCA 1260

GGACTTGACA TCTTAAGATG CGTCTTCTCC CCTTGGGCCA GTCATTTCCC CTZTCTGAGC 1320

CTCGGTGTCT TCAACCTGTG AAATGGGATC ATAATCACTG CCTTACCTCC CTCACGGTTG 1380

TOTTGAGGAC TGAGTGTGTG GAAGTTTTTC ATAAACTTTG GATGCTAGTG TACTTAGGGG 1440

GTGTGCCAGG T3TCTTTCAT GGGGCCTTCC AGACCCACTC CCCACCCTTC TCCCCTTCCT 1500

TTGCCCGGGG ACGCCGAACT CTCTCAATGG TATCAACAGG CTCCTTCGCC CTCTGGCTCC 1560

TG3TCATGTT CCATTATTGG GGAGCCCCAG CAGAAGAATG GAGAGGAGGA GGAGGCTGAG 1620

TTTGGGGTAT TGAATCCCCC GGCTCCCACC CTGCAGCATC AAGGTTGCTA TGGACTCTCC 1680

TGCCGGGCAA CTCTTGCGTA ATCATGACTA TCTCTAGGAT TCTGGCACCA CTTCCTTCCC 1740

TGGCCCCTTA AGCCTAGCTG TGTATCGGCA CCCCCACCCC A^TAGAGTAC TCCCTCTCAC 1800

TTGCGGTTTC CTTATACTCC ACCCCTTTCT CAACGGTCCT TTTTTAAAGC ACATCTCAGA 1860

TTAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAGGG CGGCCGC 1907

(2) INFORMATION FOR SEQ ID NO: 109: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 611 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109: 

ATGAATTAAC GCCAAGCTNT NAATAGGGAC TCACTATGGG GGAAAGNTGG GTAACGCCTG 

CAGGTACCGT TCCGGAATTC CCGGGTCGAC CCACGCGTCC GATGGGGCTT TAGTAAATCA 

GGCTTGCAGG CTCAAAGCTG CAATCTGCCC ACTCTCAGGT ACTGAGACTT TGTGGGCCTC 

AGACACCAGG AAGAAAGTTG GGATACAGTC ATTTGAGTTA AAAAGGGAAT GACCCCTCAG 

AAACCCGCAT TAGCAGTGTT ACTCTTGGAA GTGCCTTTAC TTTTAACGCT CTCTGTTCTG 

AAAAAGAGGT GTTTGGTTAC GTGTGAGCCA ACATCACGTT TTGTTAGCTG TGATTTACCT 

TTGTCCGTTT AAAAGACTTC ACGGAGCCAT TCTGTATACA AGGTGTGCTC TTTCCAATGT 

AGAAGGGGTT ATGGAAAAGG GTGCGATCCT TTGCTGTAAA CTGGAGAGAC CAGTCCCAAA 

CAGAGGGGAA TTTTAAGCCC TTCTCATCAC CCAATTGGAT GTTTTTGCTT ATAGCAAATT 

CCTGCAAAAT AAATAAATAA ATATTTGCAA AACTAAAAAA AAAAAAAAAA AAAAAAAAAA 

GGGGGGNCCN C

(2) INFORMATION FOR SEQ ID NO: 110:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2632 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110: 

TCCCA3CTCT CAGGACAAGG GCCCTGGGCG ATCTTTTAAA AAAGCCGATT GGGTGTCTTT 60 

CTAAAANTAC AACCAGTACT TCATCGTCAA GTTTCTGGGA AGGGAGTCCC CTCCAGATTC 12 0 

15 TCATGGAGTG ACAAATCTTG ACTCTTGCTC CTGGAATTTT TCAGGCCCAA ACTAGCGTTT 180 

CTACAATGAT TTATTTGGCA AATTTGTCTT GATTATGGGT GGCTGATGAG GAACGTGCTT 2 40 

TTGTTAGGAA CCGAAACTGG GCGGCGGTGA GGGCGTGTAC G^AATGAGTC CGGAAGAGGG 300 

TGAAATGCTT TCGGTAGGCA CTCCACGGCT GTGAAGATGG CGGCGGCTGC GTGjCTTCAG 360 

GTGTTGCCTG TCATTCTTCT GCTTCTGGGA GCTCACCCGT CACCACTGTC GTTTTTCAGT 420 

25 GCGGGACCGG CAACCGTAGC TGCTGCCGAC CGGTCCAAAT GGCACATTCC GATACCGTCG 480 

GGGAAAAATT ATTTTAGTTT TGGAAAGATC CTCTTCAGAA ATACCACTAT CTTCCTGAAG 540 

TTTGATGGAG AACCTTGTGA CCTGTCTTTG AATATAACCT GGTATCTGAA AAGCGCTGAT 600 

TGTTACAATG AAATCTATAA CTTCAAGGCA GAAGAAGTAG AGTTGTATTT GGAAAAACTT 660 

AAGGAAAAAA GAGGCTTGTC TGGGAAATAT CAAACATCAT CAAAATTGTT CCAGAACTGC 720 

35 AGTGAACTCT TTAAAACACA G AC CTTTTC T GGAGATTTTA T5CATCGACT GCCTCTTTTA 780 

GGAGAAAAAC AGGAGGCTAA GGAGAATGGA ACAAACCTTA CCTTTATTGG AGACAAAACC 84 0 

GCAATGCATG AACCATTGCA AACTTGGCAA GATGCACCAT ACATTTTTAT TGTACATATT 90 0 

GGCATTTCAT CCTCAAAGGA ATCATCAAAA GAAAATTCAC TGAGTAATCT TTTTACCATG 960 

ACTGTTGAAG TGAAGGGTCC CTATGAATAC CTCACACTTG AAGACTATCC CTTGATGATT 1020 

45 TTTTTCATGG TGATGTGTAT TGTATATGTC CTGTTTGGTG TTCTGTGGCT GGCATGGTCT 1080 

GCCTGCTACT GGAGAGATCT CCTGAGAATT CAGTTTTGGA TTGGTGCTGT CATCTTCCTG 1140 

GGAATGCTTG AGAAAGCTGT CTTCTATGCG GAATTTCAGA ATATCCGATA CAAAGGARAA 1200 

TCTGTCCAGG GTGCTTTGAT CCTTGCAGAR CTGCTTTCAG CAGTGAAACG CTCACTGGCT 1260 

CGAACCCTGG TCATCATAGT CAGTCTGGGA TATGGCATCG TCAAGCCACG CCTGGAGTCA 1320 

55 CTCTTCATAA GGTTGTAGTA GCAGRAGCCC TCTATCTTTT GTTCTCTGGC ATGGAAGGGG 1380 

TCCTCAGAGT TACTGGGGCC CAGACTGATC TTGCTTCCTT 'GGCCTTTATC CCCTTGGCTT 1440 

TCCTAGACAC TGCCTTGTGC TGGTGGATAT TTATTAGCCT GACTCAAACA ATGAAGCTAT 1500 

TAAAACTTCG GAGGAACATT GTAAAACTCT CTTTGTATCG GCAT7TCACC AACACGCTTA 1560

TTTTGGCAGT 3GCAGCATCC ATTGTGTTTA TCATCTGGAC AACCATGAAG TTCAGAATAG 1620 

TGACATGTCA GTCGGACTGG CGGGAGCTGT GGGTAGACGA TGC CATCTGG CGCTTGCTGT 1680

TCTCCATGAT CCTCTTTGTC ATCATGGTTC TCTGGCGAC Z ATCTGCAAAC AACCAGAGGT 1740 

TTGCCTTTTC ACCATTGTCT GAGGAAGAGG AGGAGGATGA ACAAAAGGAG CCTATGCTGA 1800 

AAGAAAGCTT TGAAGGAATG AAAATGAGAA GTACCAAACA AGAACCCAAT GGAAATAGTA 1860

AAGTTAACAA AGCACAGGAA GATGATTTGA AGTGGGTAGA AGAGAATGTT CCTTCTTCTG 192 0 

TGACAGATGT AGCACTTCCA GCCCTTCTGG ATTCAGATGA GGAACGAATG ATCACACACT 1980

TTGAAAGGTC CAAAATGGAG TAAGGAATGG GAAGATTTGC AGTTAAAGAT GGC TAG CATC 2 040 

AGGGAAGAGA TCAGCATCTG TGTCAGTCTT CTGTACGGCT CCATGGGATT AAAGGAAGCA 2100 

ATGACATCCT GATCTGTTCC TTGATCTTTG GGCATTGGAG TTGGCGAGAG GTGTC AGAAC 2160 

AAAGAGAACA TCTTACTGAA AACAAGTTCA TAAGATGAGA AAAATCTACG AGCTTCTTAT 2220 

25 TTACAACACT GCTGCCCCCT TTCCTCCCAG ACTCTGACAT GGATGTTCAT GCAACTTAAG 2280 

TGTGTTGTTC CTGAACTTTC TGTAATGTTT CATTTTTTAA ATCTGACAAA CTAAAAAGTT 2340 

TAACGTCTTC TAAAAGATTG TCATCAACAC CATAATATGT AATCTCCAGG AGCAACTGCC 2400 

TGTAATTTTT ATTTATTTAG GGAGTTACAT AGGTGATGGG GGAAATTGTT AACTACCTTT 2460 

CATTTTCCTG GGAAGTCAAG GTTACATCTT GCAGAGGTTG TTTTGAGAAA AAAGGGCCCT 2520 

35 TCTGAGTTAA GGAGCCATAG TTCTATCAAT GATCAAAAGA AAAAAAAAAA AACTCGATCG 2580 

GCACGAGGGG GGGCCCGGTA CCCAATTCGC CCTATGGGAN TCGAATGAGA CC 2632 



(2) INFORMATION FOR SEQ ID NO: 111: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 2249 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:

GAATTCGGCA CGAGCTCACC GTGCTGCGTG ACACAAGGCC AGCCTGCGCC TACGAGCCCA 60 

TGGACTTTKT RATGGCCCTC ATCTACGACA TGGTACTGSW TGTGGTCACC CTGGGGCTGG 120 

CCCTCTTCAC TCTGTGCGGC AAGTTCAAGA GGTGGAAGCT GAACGGGGCC TTCCTCCTCA 180 

TCACAGCCTT CCTCTCTGTG CTCATCTGGG TGGCCTGGAT GACCATGTAC CTCTTCGGCA 240 

ATGTCAAGCT GCAGCAGGGG GATGCCTGGA ACGACCCCAC CTTGGCCATC ACGCTGGCGG 300

CCAGCGCTGG GTCTTCGTCA TCTTCCACGC CATCCCTGAG ATCCAGTC-IA CCCTTCTGCC 360

AGCCCTGCAG GAGAACACGC CCAACTACTT CGACACGTCG CAGCC-AGGA TGCGGGAGA: 420 

GGCCTTCGAG GAGGACGTGC AGCTGCCGCG GGCCTATATG GAGAACAAOG CCTTCTC ZAT 480

GGATGAAGAC AATGC AGCTC TCCGAACA3C AGGATTTCCC AACGGCAGrT TGGGAAAAAG 540 

10 ACCCAGTGGC AGCTTGGGGA AAAGACCCAG CGCTCCGTTT AGAAGCAACG TGTATCAGC2 600 

AACTGAGATG GCCGTCGTGC TCAACGGTGG GACCATCCCA ACTGCTCCGC CAAGTCACAC 660 

AGGAAGAMAC CTTTGGTGAA AGACTTTAAG TTCCAGAGAA TCAGAATTTC TCTTACCGAT 72 0 

TTGCCTCCCT GGCTGTGTCT TTCTTGAG3G AGAAATCGGT AACAGTTGCC GAACCAGGCC 7 80 

GCCTCACAGC CAGGAAATTT GGAAATCCTA GCCAAGGGGA TTTCGTGTAA ATGTGAACAC 840 

20 TGACGAACTG AAAAGCTAAC ACCGACTGCC CGCCCCTCCC CTGCCACACA CACAGACACG 900 

TAATACCAGA CCAACCTCAA TCCCCGCAAA CTAAAGCAAA GCTAATTGCA AATAGTATTA 960 

GGCTCACTGG AAAATGTGGC TGGGAAGACT GTTTCATCCT CTGGGGGTAG AACAGAACCA 1020 

AATTCACAGC TGGTGGGCCA GACTGGTGTT GGTTGGAGGT GGGGGGCTCC CACTCTTATC 108 0 

ACCTCTCCCC AGCAAGTGCT GGACCCCAGG TAGCCTCTTG GAGATGACCG TTGCGTTGAG 1140 

30 GACAAATGGG GACTTTGCCA CCGGCTTTGC CTGGTGGTTT GCACATTTCA GGGGGGTCAG 1200 

GAGAGTTAAG GAGGTTGTGG GTGGGATTCC AAGGTGAGGC CCAACTGAAT CGTGGGGTGA 1260 

GCTTTATAGC CAGTAGAGGT GGAGGGACCC TGGCATGTGC CAAAGAAGAG GCCCTCTGGG 1320 

TGATGAAGTG ACCATCACAT TTGGAAAGTG ATCAACCACT GTTCCTTCTA TGGGGCTCTT 1380 

GCTCTAGTGT CTATGGTGAG AACACAGGCC CCGCCCCTTC CCTTGTAGAG CCATAGAAAT 1440 

40 ATTCTGGCTT GGGGCAGCAG TCCCTTCTTC CCTTGATCAT CTCGCCCTGT TCCTACACTT 1500 

ACGGGTGTAT CTCCAAATCC TCTCCCAATT TTATTCCCTT ATTCATTTCA AGAGCTCCAA 1560 

TGGGGTCTCC AGCTGAAANS CCCTCCGGGA GGCAGGTTGG AAGGCAGGCA CCACGGCAGG 1620 

TTTTCCGCGA TGATGTCACC TAGCAGGGCT TCAGGGGTTC CCACTAGGAT GCAGAGATGA 1680 

CCTCTCGCTG CCTCACAAGC AGTGACACCT CGGGTCCTTT CCGTTGCTAT GGTGAAAATT 1740 

50 CCTGGATGGA ATGGATCACA TGAGGGTTTC TTGTTGCTTT TGGAGGGTGT GGGGGATATT 1800 

TTGTTTTGGT TTTTCTGCAG GTTCCATGAA AACAGCCCTT TTCCAAGCCC ATTGTTTCTG 1860 

TCATGGTTTC CATCTGTCCT GAGCAAGTCA TTCCTTTGTT ATTTAGCATT TCGAACATCT 1920 

CGGCCATTCA AAGCCCCCAT GTTCTCTGCA CTGTTTGGCC AGCATAACCT CTAGCATCGA 1980 

TTCAAAGCAG AGTTTTAACC TGACGGCATG GAATGTATAA ATGAGGGTGG GTCCTTCTGC 2040 

AGATACTCTA ATC ACT ACAT TGCTTTTTCT ATAAAACTAC CCATAAGCCT TTAACCTTTA 2100

AGT AC GTCTG AC-CTGAGTAT GTT7GAATAA AC CTTTTG AT ATTTCTCAAA AAAAAAAAAA 2220

AAAAA>;CCCG GGGC-GGGC-:C CC-GACCTGG 2249 



(2) INFORMATION FOR SEQ ID NO: 112:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2198 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:

GAT ACT AT AA GGCAA.GTGAC 7CACGGGTGC GCCGTTAGAC TAGTGGATCC CGGGTGCAGG 60 

AATTCGGCA.G AGCGCCGGCG GAC-CC GAAGT GCTGGCGCCC CCGCGGCCGC TGCCTCCGCG 120 

25 GANCCCAAAA TCATGAA-.GT C AC CGTGAAG AC G CCGAAG A AAAGGAGGAA TTCGCCGTGC 180 

CCGAGAATAG CTCCGTCCAG CACCTTAAGG AAGAAATCTC TAAACGTTTT AAATCACATA 240 

CTGACCAAGT TTTCTTGACA TTTGCTGGAA AAA-TTTTGAA AGATCAAGAT ACCTTGAGTC 300 

AGCATGGAAT TCATGATGGA GTTAGTGTTC AC CTTGTC AT TAAAACACAA AACAGGCCTC 360 

AGGATCATTC AGCTCAGCAA ACAAATACAG C7GGAAGCAA TGTTACTACA TCATCAACTC 420 

35 CTAATAGTAA CTCTACACCT GGTCCTGCTA CTAGCAACCC TTTTGGTTTA GGTGGCCTTG 480 

GGGGACTTGC AGGTCTGA3T AC-CTC GGGTT TGAATACTAC CAACTTCTCT GAACTACAGA 540 

GTCAGATGCA GCGAjCAACTT TTGCCTAACC CTGAAATGAT GGTCCAGATC ATGGAAAAWC 600 

CCYTTGTTCA G.-.GCATGCTC :7?CAAATCCT GACCTGATGN AGACAGTTAA TTATGGCCAA 660 

TCCACAAATG CAGCAGTTGA TACAGAGAAA TCCCAGAAAT TAGTCATATG TTGAATAATC 72 0 

45 CAGATATAAT G-GACAAACG TTGGAACTTG CCC AGGAATC CAGCAATGAT GCAGGAGATG 780 

ATGAGGAACC AGGACCG-.GC TTTGAGCAAC CTAGAAAGCA TCCCAGGGGG ATATAATGCT 840 

TTAAGGCGCA TGTACACAGA TAT7CAGGAA CCAATGCTGA GTGCTGCACA AGAGCAGTTT 900 

GGTGGTAATC C-.TTTGCTTC C7TTGGTGAGC AATACATCCT CTGGTGAAGG TAGTCAACCT 960 

TCCCGTACAG AAAATAG-GA TCG-£TACCC AATCCATGGG CTCCACAGAC TTCCCAGAGT 102 0 

55 TCATCAGCTT CCAGCGGCAC TOCCAGCACT GTGGGTGGCA CTACTGGTAG TACTGCCAGT 1080 

C<XTACTTCTG C-GCAGAGTAC TACTGCGCCA AATTTGGTGC CTGGAGTAGG AGCTAGTATG 1140 

TTCAACACAC CAGGAA7GCA GAC-CTTGTTG CAACAAATAA CTGAAAACCC ACAACTTATG 1200 

CAAAACATGT TGTCTGCCCC CTACATGAGA AGCA7GATGC AGTCACTAAG CCA3AATCC7 1260



GACCTTGCTG CACAGATGAT GCTGAATAAT CCCCTATTTG CTGGAAATCC T ,'AGCTTCAA 132 0 



GAACAAATGA GACAACAGCT CCCAACTTTC CTCCAACAAA TGCAGAATCC TGA7ACACTA 1380



TCAGCAATGT CAAACCCTAG AGC AATGC AG GCCTTGTTAC AGATTCAGCA G3GTTTACAG 1440 



ACATTAGCAA CGGAAGCCCC GGG2CTCATC CCAGGGTTTA CTCCTGGCTT GGGGGCATTA 1500

GGAAGCACTG GAGGCTCTTC GGGAACTAAT GGATCTAACG CCACACCTAG TGAAAACACA 1560 



AGTCCCACAG CAGGAACCAC TGAACCTGGA CATCAGCAG7 TTATTCAGCA GATGCTGCAG 162 0 



GCTCTTGCTG GAGTAAATCC TCA3CTACAG AATCCAGAAG TCAGATTTCA G2AACAACTG 1680



GAACAACTCA GTGCAATGGG ATTTTTGAAC CGTGAAGCAA ACTTGCAAGC TCTAATAGCA 1740 



ACAGGAGGTG ATATCAATGC AGCTATTGAA AGGTTACTGG GCTCCCAGCC ATCATAGCAG 1800

CATTTCTGTA TCTKGAAAAA ATGTAATTTA TTTTTGATAA CGGCTCTTAA ACTTTAAAAT 1860 



ACCTGCTTTA TTTCATTTTG ACTCTTGGAA TTCTGTGCTG TTATAAACAA ACCCAATATG 1920 



25 ATGCATTTTA AGGTGGAGTA CAGTAAGATG TGTGGGTTTT TCTGTATTTT TCTTTTCTGG 1980 



AACAGTGGGA ATTAAGGCTA CTGCATGCAT CACTTCTGCA TTTATTGTAA TTTTTTAAAA 2040 



ACATCACCTT TTATAGTTGG GTGACCAGAT TTTGTCCTGC ATCTGTCCAG TTTATTTGCT 2100 

TTTTAAACAT TAGCCTATGG TAGTAATTTA TGTAGAATAA AAGCATTAAA AAGAAGCAAA 2160 
AAAAAAAAAA AAAAATTCCT GCGCCCGGGA ATTCTTCT 2198 

(2) INFORMATION FOR SEQ ID NO: 113:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1043 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:
CTGAAGTGTA TGTGGTGAGG AAGAAGAGGC TCCTACTGTA GACAGCCTTG TTCTACAGAT 60 



50 CCTCCCAGAA ATCTCTGGGC CAGGTGGAAC CCAGGGTCAG AGAGGGATGG GAGAGAGGTT 12 0 



TAATTTTCCA TGATAAATAA AAATCTATAA AATAATAAAC AAGAGAAAAG AGATTGGAAA 180 



CAGCCAGGTT GGAGCAGTGA GTGAGTAAGG AAACCTGGCT GCCCTCTCCA GATTCCCCAG 24 0 

GCTCTCAGAG AAGATCAGCA GAAAGTCTGC AAGACCCTAA GAACCATCAG CCCTCAGCTG 300 



CACCTCCTCC CCTCCAAGGA TGACAAAGGC GCTACTCATC TATTTGGTCA GCAGCTTTCT 360 
TGCCCTAAAT CAGGCCAGCC TCATCAGTCG CTGTGACTTG GCCCAGGTGC TGCAGCTGGA 420

RGACTTGGAT 3GGTTTGAGG GTTACTCCCT GAGTGACTGG CTGTGCCTGG CTTTTGTGGA 480

AAGCAAGTT^ AACATATCAA AGATWAATGA AAATGCAGAT GGAAG3TTTG ACTATGGSCT 540 

CTTCCAGATC AACAGCCACT ACTGGTGCAA CRATTATAAG AGTTAOTCGG AAAACCTTTG 600 

CCACGTAGAC TGTCAAGATC TGCTGAATCC CAACCTTCTT GCAGGCATCC ACTGCGCAAA 660 

10 AAGGATTGTG TCCGGAGCAC GGGGGATGAA CAACTGGGTT AGAATGGAAG KTTGCACTGT 720 

TCAGGCCGGC CACTCTTCTA CTGGCTGACA GGATGCCGCC TGAGATKAAA CARGGTGCGG 7 80 

GTGCACCGTG GARTCATTCC AAGACTCCTG TCCTCACTCA RGGATTCTTC ATTTCTTCTT 840 

CCTACTGCCT C CACTTCATG TTATTTTCTT CCCTTCCCAT TTACAACTAA AACTGACCAG 900 

AGCCCCAGGA ATAAATGGTT TTCTTGGCTT CCTCCTTACT CCCATCTGGA CCCAGTCCCC 960 

20 TGGTTCCTGT CTGTTATTTG TAAACTGAGG ACCACAATAA AGAAATCTTT ATATTTATCG 102 0 

AAAAAAAAAA AAAAAAAACT CGA 1043 



(2) INFORMATION FOR SEQ ID NO: 114:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 703 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:

GAATTCGGCA CGAGTGCGCG GGCACCACGG CGGTTTTTCG ACGCTGGCGG TGGACGCAGG 6 0 

CAGCATGGAC CACGGTTGCT GGGCGGATGG GGAGCGTCTA TGGTCAGTTG CCTTAGAAGT 120 

GGTGAGATGG GAAGCTGCAG TTGGAAGACC CTGGAGGATG CCTGACAAGG GGATGTCTGA 180

CACATGATTG GAGCTCTTTT TGAAATGTTT CTTGCCCTTC CTGGAGCAGA GGAGCCATTA 24 0 

45 TTTATGCAGG TACATCGAAG TCTTTTGACC TCCATACAGT GATTATGCTT GTCATCGCTG 300 

GTGGTATCCT GGCGGCCTTG CTCCTGCTGA TAGTTGTCGT GCTCTGTCTT TACTTCAAAA 360 

TACACAACGC GCTAAAAGCT GCAAAGGAAC CTGAAGCTGT GGCTGTAAAA AATCACAACC 420 

CAGACAAGGT GTGGTGGGCC AAGAACAGCC AGGCCAAAAC CATTGCCACG GAGTCTTGTC 480 

CTGCCCTGCA GTGCTGTGAA GGATATAGAA TGTGTGCCAG TTTTGATTCC CTGCCACCTT 540 

55 GCTGTTGCGA CATAAATGAG GGCCTCTGAG TTAGGAAAGG TGGGCACAAA AATCTTCATG 600 

AGCAATACTT CTTAGTAGAT TGTTTTGTTA TTCAAATCAA GTTCTAGTGT TTTTATGTGA 660 

GATTATATAA TTTACAGTGT TGTTTTATAT ACTTTTGAAT AAA 703

(2) INFORMATION FOR SEQ ID NO: 115:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3684 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115:



GGCAGAGGGG GCATGAGCAG GAGGAGGATT ACCGCTACGA GGTGCTCACG GCCGAGCAGA 60 

TTCTACAACA CATGGTGGNA ATGTATCCGG GAGGTCAACG AGGTCATCCA GAATCCAGCA 12 0 



ACTATCACAA GAATACTCCT TAGCCACTTC AATTGGGATA AAGAGAAGCT AATGGAAAGG 180 



20 TACTTTGATG GAAACCTGGA GAAGCTCTTT GCTGAGTGTC ATGTAATTAA TCCAAGTAAA 24 0 



AAGTCTCGAA CACGCCAGAT GAATACAAGG TCATCAGCAC AGGATATGCC TTCTCAGATC 300 



TGCTACTTGA ACTACCCTAA CTCGTATTTC ACTGGCCTTG AATGTGGACA TAAGTTTTGT 360 

ATGCAGTGCT GGAGTGAATA TTTAACTACC AAAATAATGG AAGAAGGCAT GGGTCAGACT 420 



ATTTCGTGTC CTGCTCATGG TTGTGATATC TTAGTGGATG ACAACACAGT TATGCGCCTG 480 



30 ATCACAGATT CAAAAGTTAA ATTAAAGTAT CAGCATTTAA TAACAAATAG CTTTGTAGAG 540 



TGCAATCGAC TGTTAAAGTG GTGTCCTGCC CCAGATTGCC ACCATGTTGT TAAAGTCCAA 600 



TATCCTGATG CTAAACCTGT TCGCTGCAAA TGTGGGCGCC AATTTTGCTT TAACTGTGGA 660 

GAAAATTGGC ATGATCCTGT TAAATGTAAG TGGTTAAAGA AATGGATTAA AAAGTGTGAT 720 



GATGACAGTG AAACCTCCAA TTGGATTGCA GCCAACACAA AGGAATGTCC CAAATGCCAT 780 



40 GTCACAATTG AGAAGGATGG TGGTTGTAAT CACA'TGGTCT GTCGTAACCA GAATTGTAAA 840 



GCAGAGTTTT GCTGGGTGTG TCTTGGCCCA TGGGAACCAC ATGGATCTGC CTGGTACAAC 900 



TGTAACCGCT ATAATGAGGA TGATGCAAAG GCAGCAAGAG ATGCACAGGA GCGATCTAGG 960 

GCAGCCCTGC AGAGGTACCT GTTCTACTGT AATCGCTATA TGAACCACAT GCAGAGCCTG 1020 



CGCTTTGAGC ACAAACTATA TGCTCAGGTG AAACAGAAAA TGGAGGAGAT GCAGCAGCAC 1080 



50 AACATGTCCT GGATTGAGGT GCAGTTCCTG AAGAAGGCAG TTGATGTCCT CTGCCAGTGT 1140 



CGTGCCACAC TCATGTACAC TTATGTCTTC GCTTTCTACC TCAAAAAGAA TAACCAGTCC 1200 



ATTATCTTTG AGAATAACCA AGCAGATCTA GAGAATGCCA CAGAGGTGCT CTCGGGCTAC 1260 

CTTGAACGAG ATATTTCCCA AGATTCTCTG CAGGATATAA AGCAGAAAGT ACAAGACAAG 1320 



TACAGATACT GTGAGAGTCG ACGAAGGGTT TTGTTACAGC ATGTGCATGA AGGCTATGAA 1380 
AAAGATCTGT GGGAGTACAT TGAGGACTGA GAATGGCCCT GCATAAAATG AACTCTGAAA 1440

ACTTTACCAT CTAGAG7GCT CA7GCAATTA AAACAAAACA AACACAAACA AGGAGGCACT 1500

AAGCCTATTC TGACACCACT GGTCTGTAGT ACCAGAATTG TTTTGTTAAT GGAAAGTTTA 1560

AGTAAATTAT ATTGTAATAA AAAGGTAGAT AAACCATTGT ACAACAGTAT TCTAGGCCGC 1620 

CAACAAAAGT GTGACAGACA CACTAAAAGC CCTCCAACTT TAACTTGTAA CGTAGCTTCA 1680 

10 TTCTCAAAGC TGACTCCTTT TTTTTCTTTT TCCrrTTCCT GAGTGTAGTA CA3TTAAAA7 1740 

TTCAAACAGC TCCTTGACAC TGCTTTTCAT GTTCAAACCA GCCATTTTGT TGTACTTTGG 1800 

TAAAGGACCT CTTCCCCTTC CTCCCCTACA CATACAGATA CACCCACACA CAGACTGACT 1860 

CTCTTTCTCT CATACCCCAA GGTCATGAGT GAATGATGCT TAGTTCCTTG TAAAGAAAAT 1920 

CTTGGGATGG GGAAAGGGGT AGGCAGCAAG AGGATTCAAC AAACGAAAAA CATAAAAACT 1980 

20 TTGTATATGA CTTTTAAAAC AAGAGGACAA CACAGTATTT TTCAAAATTG TATATAGCGC 204 0 

ATATGCATGG ACAAAGCAAG CGTGGCACGT GTTTGCATAA TGTTTAATTA CAAAAAAATA 2100 

TTTATTCTTT AAAAATCTTC AAGATTATGT CTATTTGCTG TGCATTTTCT TTCAGTTTGC 2160 

TTATCTTTCC CGGGTTGGGG TTGGGATAAA GGTGTGTCGG TTTAGCACCT CTGGAAGACC 2220 

TATCTAGAGC TCTTTCACTT TCCTGAGGTT ATTTTGCCCY TTCTGGTGTT GGTATGTCTG 22 80 

30 TTGCCGGCCA TGGGCTNCAY GCCTTGAATT CCTGCTCTTG ATCAGGGACA AGGGAGGTCA 2340 

AGCTCTGACT AATGCCATGA CCTGATTAAG GGGTACAGCA GGGAGTTTTG TTGCTACAGC 2400 

TCATGAATTA ACCTGTCCCA ACCTAATCCC CCTCCATGGC ATCATGCCTC TACCCAAGCC 2460 

TTTGTGTGCC CATGTTATGC ACACAGCTGT AGGCATTCTT AAGTCCCCTG TCGCATCCAG 2520 

TGGAAGCATT TTAAAATTTC TTTTACTTTT TGGTTTTCCC TTAATTGCTG CTTTTCAGAT 2580 

40 TTTAGTTATG GCTCGTCTGC TCACCCCTTC TCTACATTAG GGTGTCAAAG AGAATGTTTT 2640 

GCTTTAAATA TAAATAGCCA TTCATTTAGT CTCAGATTGT GAATTTAAAA TGGTGGATAC 2700 

CGAAATTGCT TGTGTGTGTT GCTGTGGGTT TGGTTTGAAG GCAAACACCC CTAGAACATG 2760 

ATATTCCCAT CTAGTGCATT TAAATAGAAA TCACTGAGTT TGCTGCTTTT TTATTGTCAG 2820 

CAGATAGGAG AATTAATAAT GCATTTTAGC TGTGATGTCC ATTTTTATGA AATTCCTAC? 2880 

50 AAGAGCTATG TTAAAAGTAA AGGATGGTGG TGGTTGTATT AACTATATAC CTGTTTAGGC 2940 

CATTCTGGCT GTGGTATTTT TCAATAGGTC AGCATCTGTA AATCTGTCAG TTTTATACAG 3000 

GAGTGCAGAG TGAACTAGGC AACTAGATTA AGAGGTCTAA ATATGAAATA CCAGTTGAGG 3060 

GTGAGGACCT CTTCGTCTTC CTTTAAATGT CTTTTGCCTA GGGAGTGTTT ACCATTTGTG 312 0 

AGGCAGCTTT GTCTGCTCTT ACACTGTACA TCCTATTACT CCATTGGGAA GTAGGTTCAC 3180 

TTTCCTCTO3 CCTTTTGCCT AAGTTAGGCT TTGCTGAATC AAGCCTACTT TTCCTTTTAG 3240

AAAAGGTTGT TACAGGAGAT TTACTGGCAA CTGTTCTTTT CCCATCAAAA ATCAGTGAAT 3300

GTTTGCTGAG TATAAATGCT GCTTCCTTAA ACCACTTGTC GCTTTAGGAT CAACTTTACC 3360

TGTACCTTTT CTCCTTTCCT CCCTTGCCAC CTCAGGTGCA AATCTGAACT CAGTGTCTGC 3420 

TTCTTCCATT TTCTCGTCTC TCTCCCCTCT TCCCCCATTA TCCATATGAC ATTATTTTAC 3480 

TTCAAATGAC AGCATCAATC TTAAAAAGAT ATACATTAAA ACTAAGGAGT TTTTTTAAAG 3540

AAAGCCTGAA TAAGTTCCTT TCCCTGGTAA CTTTGAAAAG CAGTCAGAGT TGCTATATAG 3600

ATATATGTGG CTC CTTT AAA ATGCTTTGTG TATGTGTGGT GTTTAAAAAA AAAAAAAAAA 3660

TTCGGGGGGG GGCCCGGTNC CCAT 3684 



(2) INFORMATION FOR SEQ ID NO: 116:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1965 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116:

AAGAAAGGGT ATTAAAATTC TAGATCACAT ATGGACCCGG GAAGGTTTTT NACCCTCTGT 60 

TAGTGACATC GAGTCTCCCA CTAGACAAAA TAGGTGGAAA AATCTCTCGA GGGCTCACAT 120 

35 TGTTTTGTCA TCTTCAGGAA AAACACCACC AGGCCATACC ACAGCCTGCC CAGTGAGGCG 180 

GTCTTTGCCA ACAGCACCGG GATGCTGGTG GTGGCCTTTG GGCTGCTGGT GCTCTACATC 24 0 

CTTCTGGCTT CATCTTGGAA GCGCCCAGAG CCGGGGATCC TGACCGACAG ACAGCCCCTG 300 

CTGCATGATG GGGAGTGAAG CAGCAGGAAG GGGCTCCCAA GAGCTCCTGG TGGTGCAGCC 360 

TGTGCTCCCC TCAGAAGCTC TC<:TCTTCCC AGGGCTCCCG GCTO^TTTCA GCAGGCGACT 420 

45 TTCTTCCAAT GCTGGGCCCA GACTTCTTGC CTGGGTGCTG GCCTGCCCTC TCCGGNCCGC 480 

TTGCTGCCTG TCTGCTTTCC TTGGTGGYTT TGCTrGGGTGC TGGGCCTGCC CTCTCCGGCC 540 

GCTTGCTGCC TGTCTGCTTT CCTTGGTGGC TTTGCTGGGT GCTGGGCCTG CCTTCTCTGG 600 

CTGCTTGCTG CCTGTCTGCT TTCCTTGGTG GCTTTGGCTT CTGCACTCCT TGGCGTCASC 660 

TCTCAGGTCC TCCATTCACA CGAGGTCCTC CTCGCTCTGG CCGCTCTTGC TGCTCCTGTC 720 

55 TGAAGAWATC AGACTGATTT CCTCTTAAGA CTCCTAGGGA TGTGGTGAAG AGCTGGGACT 780 

CAAGTGCAGT CCACGGTGTG AAACATGAGG GARGTGAGGT GTCCGTCCAC TTCCCCCATA 840 

AAGGTGTGCA TTTCAGTTAG GCTGCCCCGC CACAGAGCAG CCTTCATCTG CTCTGCCATC 900 

cagccccatc tggat 3tg ag gtggggtgga gacatcatgg ggtgattgca gaaa 3gggga 960

gto3cggccc acgca3cttc tgctgaggag ctgaccgctc tgagctgttc tgtttcgtat 1020 

tg:tgctctg tgtctgcatg tattgtgacc gtgcggctcc acctcttc:a gctgctgcta 1080

cagctgaggc ctggatcgcg gcctttgc rr gtgacttacg tgtctgtcac c3gcangcag 1140

CCCTACAAAT 3CTGGTGACC 7GCTCTCC :A AGAACAGAGC CTGTCCC2AG ATGTCCCAGT 1200 

AGCGATGAGT AACAGAGGTG GCTGTGGArT TCCTCTACTT CTCCTTGC TG GATCAGGGCC 1260

TTGCTGCCT 2 2 3 GOT 3GGCA G3TCTG3C3T TGCTCTCTTG GCAGGGCTX A3CCCCTCTG 1320 

15 AC 2ACTCTGC AGCTCACCAT G3AGCT3AT3 CCAAAGTTGT GGTGTCCAGT 3T03AGCAGC 1380 

CCTGGGAGCC ACTGCCACCT T : 3AGAGGG3T TCCTTGCTGA GACCCACATT G3TTGACCT3 1440 

GCCCCACCAT G3CTGCTTGC CTGGCCCAAG CTAGCGTTCT GTGCCATOCT AGAG3TTGAG 1500 

CTGTTGCTCT TCTTCAGGGG A3GAAATAG3 GTGGAGAGCG GGAAGGGT 3T TGCTCCTAAG 1560 

TGTTGCTGCT GTGGCTTTTT TGCCTTCTCC AAAGACGCAC TGCCAGGTCC CAA3CTTCAG 1620 

25 ACTGCTGT3C TTAGTAAGCA AGTGAGAA 2*2 CTGGGGTTTG GAGCCCACCT ACTi3TCTGGC 1680 

AGCATCAG3A TCCTA3TCCT G3CAACATCA GGCCAACGTC CACCCCAGCC TCACATTGCC 1740 

AGAT'3TTG3C AGAAGG03TA ATATTGACCG TCTTGACTGG CTGGAGCCTT CAAAGCCACT 1800 

GGGATGTC3T CCAGGCACCT GGGTCCCATG ACCAGCTCCC CGTCTCCATA G3GGTAGGCA 1860 

TTTCACTGGT TTATGAAGCT CGAGTTTC AT TAAATATGTT AAGAATCAAA GCTGTCTTTG 1920 

35 TTCAGGCTGC TATAACAAAA ATATAATAG3 CTGGGTGGCT TAAAC 1965 

(2) INFORMATION FOR SEQ ID NO: 117:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 503 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117:

50 AGTGATCCCC TTGCCTCGGC CTCCCAAAAT GCTGGAATTG TAAGCGTGGG CCTCTGCACC 60 

CGGCCTGGTC CGCAATTTAA AAACGCACAG CCACCATTCC CTYTCCAGAA AGCACCCAGA 120 

TGCCTTTGGG AGAACCAGCC TCCTCCATGG AGGAAAGCTT GGGATCTGCC TTCCCACCTG 180 

GGGAGGAGAG GGATCTGTGG AAAATCCTTC TGACGGACTT CCCCTCAGTG CCTGATCCAT 240 

ACTCAATAGT AGAAAAAGTA AGAAATATAC AAAGATAGCA GATACAC3GA GACAGTTCCC 300 

CAAATAGCTG AGCGAV/TAGC GCAGAAGCAA TATTGAAGAC CTAATAGCTG AGACATTTCC 360

A3AACTGATA AAGTGCATCC AGCCACAGAT CAAGCA3CCC AGAAAATTC C AGGCAGCATC 420 

AA.CAAATAAA TAGCCCCACA TGCACCCGTG AAAATGCAGA AGACCAAACA AAAAAGTCCG 480

GTCAACAGCC AGAGTTAAAG AGG 503 

(2) INFORMATION FOR SEQ ID NO: 118:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1133 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118:

GGCACAGCTT GGAATGAACC CCTGTGGATA AGGGGGACTA TTAGATAGAA TAAACATCAA 60 

TAAATGCTTG ATGAATAAAC GCTAATCCTA CCTTCCCA3C CTGACACCTC CCAGTGGACA 120 

25 CCACACTTCA CTTGAAGCCT TAGAAACCTT TCCCACCCAT GCTTCCAGCC CTGGCTTCAT 180 

GTTGCCATTT CTCACCCCCA GAACAGGCCG CCCGCCTGAA GAAACTACAA GAGCAAGAGA 240 

AACAACAGAA AGTGGAGTTT CGTAAAAGGA TGGAGAAGGA GGTGTCAGAT TTCATTCAAG 300 

ACAGTGGGCA GATCAAGAAA AAGTTTCAGC CAATGAACAA GATCGAGAGG AGCATACTAC 360 

ATGATGTGGT GGAAGTGGCT GGCCTGACAT CCTTCTCCTT TGGGGAAGAT GATGACTGTC 420 

35 GCTATGTCAT GATCTTCAAA AAGGAGTTTG CACCCTCAGA TGAAGAGCTA GACTCTTACC 480 

GTCGTGGAGA GGAATGGGAC CCCCAGAAGG CTGAGGAGAA GCGGAACNTG AAGGAGCTGG 540 

CCCAGAGGCA ANGAGGAGGA GGCAGCCCAG CAGGGGCCTG TGGTGGTGAG CCCTGCCAGC 600 

GACTACAAGG ACAAGTACAG CCACCTCATC GGCAAGGGAG CAGCCAAAGA CGCAGCCCAC 660 

ATGCTACAGG CCAATAAGAC CTACGGCTGT KTGCCCGTGG CCAATAAGAG GGACAC AC GC 720 

45 TCCATTGAAG AGGCTATGAA TGAGATCAGA GCCAAGAAGC GTCTGCGGCA GAGTGGGGAA 780 

GAGTTGCCGC CAACCTCCTA GGCGCCCCGC CCAGCTCCCT TTGACCCCTG GGGCAGGGCA 840 

GGGGGCAGGG AGAGACAAGG CTGCTGCTAT TAGAGCCCAT CCTGGAGCCC CACCTCTGAA 900 

CCACCTCCTA CCAGCTGTCC CTCAGGCTGG GGGAAAACAG GTGTTTGATT TGTCACCGTT 960 

GGAGCTTGGA TATGTGCGTG GCATGTGTGT GTGTGTGTGA GAGTGTGAA.T GCACAGGTGG 1020 

55 GTATTTAATC TGTATTATTC CCCGTTCTTG GAATTTTCTT CCCATGGGGC TGGGGTACTT 1080 

TACATTCAAT AAATACTGTT TAACCCAAAA AAAAAAAAAA AAAAGAAAGA AGN 1133

(2) INFORMATION FOR SEQ ID NO: 119:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1101 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119:

GGGCACAGCT GAAGCTGCAG ACCTCCCCAG GGGATGGCTC CTCTCCCCCA 3GAGCCCCGA 60 

GGCAGGGGA3 GCAGAAAGCC TGGGCTCTGG GGGGTGGCCT GCGGACAGCT GTGCTGTGGG 120 

CCGGGGGCTG GGCCTGTCCC ACAGGGNCGT GGAGCTCGTG GTTCTGAGCA GCCAGCTGGG 180 

TGGTGTCTGG GGATAGCT GG GAGGCACAGC GGCTGCCATG TGGGACTGGG ACTGGAGTGC 240 

20 TCCCTGGTCT TGGCCTCTGT GGCTCAGCCT TGCTCTGGTC TGCCTGAGTG CAGGGGCCAA 300 

GGGGCACAGG GCCAGTGAGG CCGGCCACGC TCGGGCCCTC ACCTGTGAGA TGGGGTCGGA 360 

ATTTKACACA GCCTANGGCT TGGTTCTTGG TKGTNGAMCG TGGACTYCTK AGAACGGGAG 42 0 

TGCTGGTCCT GAAAGGCGTG GTTGGAGACC AGCTGCTTTT CTCGCTGTTT TTCTCTTAGG 480 

AGATTAAACA AAAACAGAAA GCACAAGACG AACTCAGTAG CAGACCCCAG ACTCTCCCCT 540 

30 TGCCAGACGT GGTTCCAGAC GGGGAGACGC ACCTCGTCCA GAACGGGATT CAGCTGCTCA 600 

ACGGGCATGC GCCGGGGGCC GTCCCAAACC TCGCAGGGCT CCAGCAGGCC AACCGGCACC 660 

ACGGACTCCT GGGTGGCGCC CTGGCGAACT TGTTTGTGAT AGTTGGGTTT GCAGCCTTTG 72 0 

CTTACACGGT CAAGTACGTG CTGAGGAGCA TCGCGCAGGA GTGAGGCCCA GGC GCCGAGA 780 

CCCAAGGCGC CACTGAGGGC ACCGCGCACC AGAGCGTGAC CTCGGCAGGC TGGACACACT 840 

40 GCCCAGCACA GGCAGACCCA CCAGGCTCCT AGGTTTAGCT TTTAAAAACC TGAAAGGGGA 900 

AGCAAAAACC AAAATGTGTG ACTGGGCTTT GG AGGAGACT GGAGCCTCAG CCCTGTCCTG 960 

GCCACGGGCC GCTGGGGCTG GTGTGGGTGG GCCTTGTGTG CTGGATTTGT AGCTTATCTT 1020 

CCGTGTTGTC TTTGGACCTG TTTTAGTAAA CCCGTTTTTC ATTTTAAAAA AAAAAAAAAA 1080 

AAACTTTGGG GGGGGGCCCC N 1101 



(2) INFORMATION FOR SEQ ID NO: 120:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 282 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120:

AGCTTCTCTG 7CCAGTCTTG AACTCTGGGS TCTCTTGGAA CTTTCCTCAC CCCTCTCAGC 60 

CTGAATATT C CTTCCATGGA TTCCACTCAA CCAGACTTTG GATCTGTGCC TACTTAATCA 120

AC CTr A r rCTT T3CAATATGT TC3GGCCCAC CTTCCACTCC TTGGTTCTTG TTCCTCCTTG 180

GCCTAACTTG TCCCTTCTCC ACTTCACATC CCCGGTGGGA CAGCATTCCT CCTTCCTCCC 240 

AACCTCCCTC CGTCTCARAA AAAAAAAAAA AAAAAAAAAA TT 282

(2) INFORMATION FOR SEQ ID NO: 121:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2635 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121:

TAAGGGGGTG TGTGCTCACC TCCTCCTGAC CCTTAACACT CCTGTCCTGC CCAGACCAAC 6 0 

AGAGAGAGCT GTCCCTGAGA CCCCGGAGAG AAGCAGCTGC CGAAAGCTGC AGCCTTTCCG 120 

30 CACTCTGAGA CCATGATCTT CCTCCTGCCA GGGGAGAGCC ACCCACAGGC CATGTCCAGC 180 

CCCACTTCCC TCAGCCCCCA GGGYTTCCTT CTGGCCCCTC TGAGGATTCC CTAGGGCTGC 240 

CCCGCAGAGG GGYTTCCCCA AGCTCTGTTT TGAAGCCTGC AATGTGGAAA AGTGAGAAGT 300 

CAGAGGGAAC AGGACAGGTG CAGCCGGGCT CTGAGGCCAC ACCTCACACC TCGCTGTTCC 360 

CCAACATCCC CTGAGCAGTG TGAGCTCATC TCACCAGATG AGAAGAGGCC CTGTGCATTT 420 

40 YTTTTGTTTG TTTGTTGCTG TTTTCCCCCA CCCATCCAGT TCTCCTCAGC AAAGCAAATT 480 

CCTTAACACC TTTGGTGGAG AATTTCTTAC CCAGACTTGG GGCTGTGATG CCCTTCAGTG 54 0 

CGTGGTGAGT GCAGCGTGTG TGCGTGTGCC TGTGTGTGAA CCTGGGGGCC ATCCTGGTGG 600 

CCTGGGAGCG TGAGGAGAGG CCCCCTGTGT GCTGGGTGAG TGGTGGGTGT GGGGTCAATG 660 

CAGTGAGGCT CTCTGGGTGA GGCTCCCAAC CTGGCAGTCC CCAGCCTCCC AGCATCTGTG 720 

50 AGCGTCTGTT GGACTTTACA GAAGAGCCTC ATCCYGTCTG CCCCTCACTC TGCCCTGGAA 7 80 

TCAACATCTT CCGAGTCCTT CTTGGGGGAA ATAGCAGAGC CCCACTTAAC TCCATAAACT 840 

GCTTCCCATT CCGCAGCCCA CTTCTGATTG TTGAGGTGTC GCGTCGTTCC AGGTCCCCCA 900

GTCCCCTCTT TCTCCTGTCC TCTCTCTGTC CTTCACCTCC CCACTCCAGC CCCGGCTCAG 960 

TTCAGGGAAA TGCTGTTCCA YATCAGCCCT CTGCTCTCTG AGGCAGCCGC GCCTCTGACT 1020 

CGGAGCTACT TGAAACTTCT GCTCTTGCTA GGATTGGAGT CTACCTATCT CTTCCATTTG 1080

7CCCAGCTGG AGTTCTGGAA CTTTCCTCCT CC-C-GG7 GGGG GTG-GGGGCTG TTAAC-GATGC 1140

TGGGGGGCCT GGGGAAGGAA GGAGTTCAGA C<LA-.GGGTGT CCCTTr — ~ ^-~GTC- 3 ~ - Q 

CCCTCCGCTC CTGGGACACG TGCTCTCTCT GTC7CTGGGT CTT?TGGr7G r'^GTTTC 1260

TGTGTCCTTG TAAATA7GTT TTAGGAAGAA AGC-_=-AA3G3 ACT 3AAC7AG CCCTTGGTAG 1320

GATTGCAGGG GTCCAGCCTT GCCTGTTTCC GAACGCCCCA CACTC-C7TTT CGCCCCACTG 1380

AGACTGGTCC CCTCAAAAGG TAGACAAAAC AGCAGCTCCC TGTGGAGCTG AAGGGCGGCC 1440

TCAAAGTGGC TTTTTGTTAG ACAAGGTTAA GGTCTCCTCA TGAGC--^— ^C-G-TCC-G 1500

tccttcctca gctccttgat ttgtgacctt gaccaagggg cctgccaccc ac-czcctcca 1560

GTGCCCTCTC CTCGATGCCT CGCTCCTTCC TGCCCCCACT CCCCTCGCTT AGCCAGGTAC- 1620

GGG AATTAGG GCCATGCTGG AAGAAGCTTA ACCATGTGTT CAAAG AAC GG TTTGTTGC7T 1680

GCTTGGTCCT GGAACTCCCC TTGGCTGCCC CAGGCCTCCT TGGCCCATGG GCGCTGGGG:- 1740

AGGTGGATGT CAGATCTGGT AGGTTGCAGC A.GAGAAAATA AATGTGCCTT GAGAGACCA2 1800

TCAGAGAGGG TCCAAGGGTG ATGGAGAAGG AAGCAIOGC C TGGGAGCCCG GAAGGGARGG 1860



GTGGTGGGTG GCGGCATCTT GACTGCCCCC TGTTGTCCCA CAC GTGGGGG GTC-GCCACCC 1920

CYCTTCACTC CAGCCCGCCT GCCTTCAGCC CTXTCATGAGC TTCACCTGCC TC ZAAC77TCA 1980

CTTTGGAGGG GGTGGGGTCC GTTGGCATCA ACACGC-GGAC CCTCTC-GTTG ACCAAAGCCC 2040 
GAGCCCTCAG CCCCTGGGGA GAACAAATGG CTGAGCTTTG AT AC CTGGGG TCGTCGAGAG 2100 

GCTGCGGGCT GGCGGCAGTC CCAGGGGAGA GACACCACAG AAGGAGACCC AG-.CATCCCG 2160 
AGGAAGTTCC CAGCAGAGCA AACTGCTTTC CAGCCTGAAG CCTGCTTA-A CTGTGTGATG 2220 

TGCAATAACT GAGCTTAGAG TTAGGAATTG TGTTCAAGTG CTTGGATTTG CGTCTGTAGA 2280

TTTAACTGCT GAAATTGTAT CTCTCAGTAA TTTTAGATGT CT1V1AAAAA ATTGAA AAAC 2340 
AAAGTGTTAG ACTGTGTGCG TGTGCGTTGA TGGGCA.CTCA AGAGTCC2 CT GA.GTCATCCA 2400 

GCCCTGCCTT TCCCCTGCGC CCCCATCCTC TCACGTCCCG CCCYGCC7CC ACTTGGGGAC 2460 
CCTGCCTCGT GTCGTCTTTA TCTGCCTATT ACTO.GCCTA AGGAAACAAG TACACTCCAC 2520 
ACATGCATAA AGGAAATCAA ATGTTATTTT TAAGAAAATG GAAAATAAAA ACTTTATAAA 2580

CACCAAAAAA AAAAAAAAAA ACCCNGGGGG GGGGCCGGTA ACCCATTTCG CCTAA 2635

WO 98/54963    PCT/US98/11422

(2) INFORMATION FOR SEQ ID NO: 122:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 994 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122:

GAATTCGGCA GAGGTTCGGC GAAGATAGGG AATAAGGAAG CACAGGAGTA GGGGAGAAGG 60

AAGCACAGGA GTAGGGGAGA TAT AC AGO GG TCAGGATAAG GGGGAAAGGG CGGTGGTTGC 120

SCAAGAGGTG AAACAAGATG TGAGAGACAA GGGGTAGGGA AGAAATGGGG CAGCGGTTAG 180

GTTCAGAAGC GCATAGACCG TGGCGGACGG GCAATGCGAG GGGCACAGAA AGGAACTGAG 240

GGGTGGGCTA TTTTAARGGA GATGGTCCTT CAGCCCTCTT YTTTTCTGCG TAGTTCTCCT 300

CCTCCAGGCC GCGCGCGGAT ATGTCGTCCG GAAACCAGCC CAGTCTAGGC TGGATGATGA 360

CCCACCTCCT TCTACGCTGC TCAAAGACTA CCAGAATGTC CCTGGAATTG AGAAGGTTGA 420

TGATGTCGTG AAAAGACTCT TGTCTTTGGA AATGGCCAAC AAGAAGGAGA TGCTAAAAAT 480

CAAGCAAGAA CAGTTTATGA AGAAGATTGT TGCAAACCCA GAGGACACCA GATCCCTGGA 540

GGCTCGAATT ATTGCCTTGT CTGTCAAGAT CCGCAGTTAT GAAGAACACT T3GAGAAACA 600

TCGAAAGGAC AAAGCCCACA AACGCTATCT GCTAATGAGC ATTGACCAGA G3AAAAAGAT 660

GCTCAAAAAC CTCCGTAACA CCAACTATGA TGTCTTTGAG AAGATATGCT GGGGGCTGGG 720

AATTGAGTAC ACCTTCCCCC CTCTGTATTA CCGAAGAGCC CACCGCCGAT TCGTGACCAA 780

GAAGGCTCTG TGCATTCGGG TTTTCCAGGA GACTCAAAAG CTGAAGAAGC GAAGAAGAGC 840

CTTAAAGGCT GCAGCAGCAG CCCAAAAACA AGCAAAGCGG AGGAACCCAG ACAGCCCTGC 900

CAAAGCCATA CCAAAGACAC TCAAAGACAG CCAATAAATT CTGTTCAATC ATTTAAAAAA 960

AAAAAAAAAA AAAAAAAAAA AAAAAGGGGA GGGG 994

(2) INFORMATION FOR SEQ ID NO: 123:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1542 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123:

GGCASAGCCA CCTCGGCCCC GGGCTCCGAA GCGGCTCGGG GGCGCCCTTT CGGTCAACAT 60 

CGTAGTCCAC CCCCTCCCCA TCCCCAGCCC CCGGGGATTC AGGCTCGCCA GCGCCCAGCC 120 

AGGGAGCCGG CCGGGAAGCG CGATGGGGGC CCCAGCCGCC TCGCTCCTGC TCCTGCTCCT 180 

GCTGTTCGCC TGCTGCTGGG CGCCCGGCGG GGCCAACCTC TCCCAGGACG ACA3CCAGCC 240

CTGGACATCT GATGAAACAG TGGTGGCTGG TGGCACCGTG GTGCTCAAGT GC CAAGTGAA 300

AGATCACGAG GACTCATCCC TGCAATGGTC TTAACCCTGC TCA3CAGACT CTCTACTTTG 36 0 

GGGAGAAGAG AGCCCTTC3A GATAATCGAA TTCAGCTGGT TAM2TCTACG CGCCACGAGC 420 

TCAGCATCAG CATCAGCAAT GTGGCCCTGG C A 3ACGAGGG C3AGTACACC TGCTCAATCr 480 

10 TCACTATGCC TGTGCGAACT GCCAAGTCCC TC jTCACTGT GCTAGGAATT CCACAGAAG: 540 

CCATCATCAC TGGTTATAAA TCTTCATTAC GG3AAAAAGA CACAGCCACC CTAAACTGTC 600 

AGTCTTCTGG GAGCAAGCCT GCAGCCCGGC TCACCTGGAG AAAGGGTGAC CAAGAACTCC 660 

15 

ACGGAGAACC AACCCGCATA CAGGAAGATC CCAATGGTAA AACCTTCACT GTCAGCAGCT 720 

CG3TGACATT CCAGGTTACC CGGGAGGATG ATGGGGCGAG CATCGTGTGC TCTGTGAACC 780 

20 ATGAATCTCT AAAGGGAGCT GACAGATCCA CCTCTCAACG CATTGAAGTT TTATACACAC 840 

CAACTGCGAT GATTAGGCCA GACCCTCCCC ATCCTCGTGA GG3CC AGAAG CTGTTGCTAC 900 

ACTGTGAGGG TCGCGGCAAT CCAGTCCCCC AGCAGTACCT ATGGGAGAAG GAGGGCAGTG 960 

25 

TGCCACCCCT GAAGATGACC CAGGAGAGTG CCCTGATCTT CCCTTTCCTC AACAAGAGTG 1020 

ACAGTGGCAC CTACGGCTGC ACAGCCACCA GCAACATGGG CA3CTACAAG GCCTACTACA 1080 

30 CCCTCAATGT TAATGACCCC AGTCCGGTGC CCTCCTCCTC CAGCACCTAC CACGCCATCA 1140 

TCGGTGGGAT CGTGGCTTTC ATTGTCTTCC TGCTGCTCAT CATGCTCATC TTCCTTGGCC 1200 

ACTACTTGAT CCGGCACAAA GGAACCTACC TGACACATGA GGCAAAAGGC TCCGACGATG 1260 

35 

CTCCAGACGC GGACACGGCC ATCATCAATG CAGAAGGCGG GCAGTCAGGA GGGGACGACA 1320 

AGAAGGAATA TTTCATCTAG AGGCGCCTGC CCACTTCCTG CGCCCCCCAG GGCCCTGTGG 1380 

40 GGACTTGCTG GGGCCGTCAC CAACCCGGAC TTGTACAGAG CAACCGCAGG GGCCGSCCCT 1440 

CCCGNTTGTT CCCCAGCCCA CCCACCCCCT TGTTACAGAA TGTYTKGTTT GGGGTGCGGT 1500 

TTTGTWATTG GTTTNGGATN GGGGAAGGGA GGGANGGCGG GG 1542 

45 



50 



55 



60 



(2) INFORMATION FOR SEQ ID NO: 124: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1390 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124:

CAAGCTCTAA TACGACTCAC TATAGGGAAA GCTGGTACGC CTGCAGGTAC C3GTCCGGAA 60

TTCCCGGGTC CACCCACGCG TCGGGGCCTC AG3GTGGAC3 CATGoTTCTG CZ.CZXAGZCZ 120

CTCGTCATGG TG3CGCCTGT GTQjTACTT3 GTAGCGGCGG CTCTGCTAGT CGGCTTTATC 180

CTCTTCCTGA CTCGrAGCCG GGGCCGGGC3 GCATCAGCC3 :3CCAAGAG;C ACT3CACA.iT 240

GAGGAGCTGG CA3GAGCAGG CCGGGTGGCC CAUCCTGGGC CCCTGGAG3C TGA3GAG:CG 300

AGAGCTGGAG GCAGGCCTCG GCGCC'jGAGj GAGCTGGGCA GCCGCCTA^ GGCZCAGCGT 360

CGAGCCCAGC GGGTGGCCTG GGCAGAAGCA GATGAGAACG AG3AGGAA3C TGTCATCCTA 420

GCCCAGGAGG AGGAAGGTGT CGAGAAGCCA GCGGAAAYTC ACCTGTCG3G GAAAATTGGA 480

G3TAAGAAAC TGCGGAANNT GGAGGAGAAA CAAGCGCGAA AGGCCCAGCK TGA<3GCAGAG 540

GAGGCTGAAC GTGARGWGCG GAAACGACTC GAGTCCCAGC GCGAATGAGT G3AAGAAGGA 600

GGAGGAGCGG CTTCGCCTGG AGGAGGAGCA GAAGGAGGAG GAGGAGAG3A AGGCCCGCGA 660

GGAGCAGGCC CAGCGGGAGC ATGAGGAGTA CCTGAAACTG AAGGAGGCCT TTGTGGTGGA 720

GGAGGAAGGC GTAGGAGAGA CCATGACTGA GGAACAGTCC CAGAGCTTCC TGACAGAGTT 780

CATCAACTAC ATCAAGCAGT CCAAGGTTGT GCTCTTGGAA GACCTGGCTT CCCAGGTGGG 840

CCTACGCACT CAGGACACCA TAAATCGCAT CCAGGACCTG CTGGCTGAGG GGACTATAAC 900

AGGTGTGATT GACGACCGGG GCAAGTTCAT CTACATAACC CCAGAGGAAC TGGCCGCC3T 960

GGCCAACTTC ATCCGACAGC GGGGCCGGGT GTCCATCGCC GAGCTTGCCC AAGCCAGCAA 1020

CTCCCTCATC GCCTGGGGCC GGGAGTCCCC TGCCCAAGCC CCAGCCTGAC CCCAGTCCTT 1080

CCCTCTTGGA CTCAGAGTTG GTGTGGCCTA CCTGGCTATA CATCTTCATC CCTCCCCACC 1140

ATCCTGGGGA AGTGATGGTG TGGQCAGGCA GTTATAGATT AAAGGCCTGT GAGTACTGCT 1200

GAGCTTGGTG TGGCTTGGTG TGGCAGAAGG CCTGGCCTAG GATCCTAGAT AAGCAGGTGA 1260

AATTTAGGCT TCAGAATATA TCCGAGAGGT GGGGAGGCTC CCTTGGAAGC TGGTGAAGTC 1320

CTGTTCTTAT TATGAATCCA TTCATTCAAG AAAATAGCCT GTTGCAAAAA AAAAAAAAAA 1380

AAAAACTCCA 1390

(2) INFORMATION FOR SEQ ID NO: 125:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1288 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125:

GGCGCGCGGG TGAAAGGCGC ATTGATGCAG CCTGCGGCGG CCTCGGAGCG CGGCGGASCA 60

GACGCTGACC ACGTTCCTCT CCTCGGTCTC CTCCGCCTCC AGCTCCGCGC TGCCCGGCAG 120

CCGGGAGCCA TGCGACCCCA GOGCCCCGCC GCCTCCCCGC AGCGGCTCCG CGGCCTCC7G 180

CTGCTCCTGC TGCTGCAGCT GCCCGCGCCG TCGA3CGCCT CTGAGATCCC CAAGGGGAAG 240

CAAAAGGCGC ATCCGGCA3A GGGAGGTGGT GGACCTGTAT AATGGAATGT GCTTACAAG3 300

GCCAGCAGGA GTGCCTGGTC GAGACGGGAG CCCTGGGGCC AATGGCATTC CGGGTACACC 360

TGGGATCCCA GGT^GGGATG GATTCAAAGG AGAAAAGGGG GAATGTCTGA GGGAAAGCTT 420

TGAGGAGTCC TGGACACCCA ACTACAAGCA GTGTTCATGG AGTTCATTGA ATTATGGCAT 480

AGATCTTGGG AAAATTGCGG AGTGTACATT TAGAAAGATG CGTTCAAATA GTGCTCTAAG 540

AGTTTTGTTC AGTGGCTCAC TTCGGCTAAA ATGCAGAAAT GCATGCTGTC AGCGTTGGTA 600

TTTCACATTC AATGGAGCTG AATGTTCAGG ACCTCTTCCC ATTGAAGCTA TAATTTATTT 660

GGACCAAGGA AGCCCTGAAA TGAATTCAAC AATTAATATT CATCGCACTT CTTCTGTGGA 720

AGGACTTTGT GAAGGAATTG GTGCTGGATT AGTGGATGTT GCTATCTGGG ITGGCACTTG 780

TTCAGATTAC CCAAAAGGAG ATGCTTCTAC TGGATGGAAT TCAGTTTCTC GCATCATTAT 840

TGAAGAACTA CCAAAATAAA TGCTTTAATT TTCATTTGCT ACCTCTTTTT TTATTATGCC 900

TTGGAATGGT TCACTTAAAT GACATTTTAA ATAAGTTTAT GTATACATCT GAATGAAAAG 960

CAAAGCTAAA TATGTTTACA GACCAAAGTG TGATTTCACA TGTTTTTAAA TCTAGCATTA 1020

TTCATTTTGC TTCAATCAAA AGTGGTTTCA ATATTTTTTT TAGTTGGTTA GAATACTTTC 1080

TTCATAGTCA CATTCTCTCA ACCTATAATT TGGGAATATT GTTGTGGTCT TTTGTTTTTT 1140

CTCTTAGTAT AGCATTTTTA AAAAAATATA AAAGCTACCA ATCTTTGTAC AATTTGTAAA 1200

TGTTAAGAAT TTTTTTTATA TCTGTTAAAT AAAAATTATT TCCMACAACC TTAAAAAAAA 1260

AAAAAAAAAA AAAAAAAAAA AAAAANAA 1288

(2) INFORMATION FOR SEQ ID NO: 126:



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1517 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126:

AGTGGCTTAA AGGCATCGTT TTAGGGATTA CTOGGAAGTA TCTTCAAAGT AATACATGAG 60

AAACATTCCT TCCTAAATCC TTTATTATAT TGAATATCGT ATTAATTGGT TTTCAGAGGT 120

TAAATTAACC ATGTATTCCT GCAATAAATG TCACTTGTNT CTTGTATATA ATCTTTTTTA 180

7ATATTACCO GATTGATTCA TTAGTATTTT GTTGAGGATT TTTGTGTCTA TATTCATAAG 240

AGATOCTOGT CTGCAGTTTT CTTTTTTTGT GATAATCTGG TTTTTGTATC AGTAATACAG 300

GCCCCATGAA ACGAGTTGGG AAGTGTTCAC CTCTCTTGTA TTTTTTCAAG AGTTTGTGAA 360

GAATTGCTAT TAATTCTTTA AATGTTTGGT AGAATCTACC ATTGAAATCA TGTGTCCTGG 420

GCTTTTTTTT GAGGGAAGTG TTCTGATAAC TAATTCAGTA TCTACTTTTT ATAGCTCTGT 480

TCAGATTTTG CTTCTTCCTG AGTTAGTTTT GGTAATTTGT GTATCTCTAG GARTTTGTCC 540

ATTTCATTTA TCTCATTTGT TGGCATAAAT TAAACTAAAT TTGGCCTGAG CCTACCTGTA 600

TATCTTGAGT CCCTCTGTAA GGAACTGTAG CCTAACTTGT ACATAAACAA ACTGAAATCC 660

TAAATTAGGA ATGTAGTTTT TGTAACAGCT CCTGAGTCTC AGGCAGTCAC AGCAGYCAAG 720

TPTCTrAATT C^CAGGCTGCT AACTAAGCAG CCCATGSTCA AATGAGGCAA AAACCTTTGC 780

TTTTiarara tahtatagct TTGTAATCCT TTTCTTGCAC ACTCGGGTAA TTTCTTCCTT 840

TTTCATTCCC KGWATTTTCC AKGAATATGA RTCTYCCTTT TTTCCCCTCC TGTCAGTCTA 900

GCTAATGGTT TGTCAATTTT GTTGATCTTT TGAARAACAA ACCTTTGGTT CCACTTTCTT 960

GTTGCATATG CTGARTATTC TCATAATTGG AGTGGAAAGC TGATCTTTGA TTACTTATTT 1020

TACTTAGGGC TGAGGAGTTC ATGGACTTCG CAAAACCTCC TTGAATCTAA ATTGCATCTT 1080

CTTTCCTGGT TTCTGGGCTG AAACATGTTT TTTCCCATCT WANAWACCCT TGGTCTTTTC 1140

ATKGGCGATT AAGACTAGAG AAAGTTCTAG ATMCCTTGTC CTTTTATGCT GTGATTTTGT 1200

TTAAAGGCTT TCTATGTAGT AAAACTATCT ATATAGACAA AATAGAGCCT TGAGTTGTGG 1260

TCTTGAATTT GATCAACATG ATTTACCACA TTCTGTACTG GATATTTCTT CACCTGCTGC 1320

TACTGTAAAC CATTTTATTC TTGGATCTTC TGTAGAGTAT ATTATCACAG GTACTTTTTA 1380

CAGGGGTGTC TAATCTTTTG GCTTCCCTGG GCACATTGAA AGAAGAAGAA TTGTCTTGGG 1440

CCACACATCA AATACGCTAA CACTAATAAT AGTTGATGAG CTAAAAAAAA AAAAAAAAAG 1500

GCAAAAAAGN CCCAAAA 1517


(2) INFORMATION FOR SEQ ID NO: 127:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1073 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127:




TTCTGCAT^ TGAAATAGAT 7GG7TTGGAA AATGAACCTG GCTTTGCTAT AAATTACATT 120 

CA^GXTTr TTTGCAAATG TGTAACTTGC CTATCAAAGT AGTTTGTAGG GCAAATGCAG 180 

AA7A7ATGTC TCCATCTGGT AAAGTACCTT WTAYTCATGT GGGAAATCAA GTAGTATCAG 240

AACTTGGTCC AATAGTCCAA TTTGTTAAAG CCAAGGGCCA TTCTCTTAGT GATGGGCTGG 300 

AGGAAGTCCA AAAAGCAGAA ATGAAAGCTT ACATGGAATT AGTCAACAAT ATGCTGTTGA 360 

CTGCAGAGCT GTATCTTCAG TGGTGTGATG AAGCTACAGT AGGGRKGATC ACTCATGMTA 420 

GGTATGGWTC TCCTTACCCT TGGCCTCTGW WTCATATTTT GGCCTATCAA AAACAGTGGG 480 

AAGTCAAACG TAAGNTGAAA GCTATTGGAT GGGGAAAGAA GACTCTGGAC CAGGTCTTAG 540 

AGGATGTAGA CCAGTGCTGT CAAGCTCTCT CTCAAAGACT GGGAACAGAA CCGTATTTCT 600 

TCAATAAGCA GCCTACTGAA CTTGACGCAC TGGTATTTGG CCATCTATAC ACCATTCTTA 660 

CCACACAATT GACAAATGAT GAACTTTCTG AGAAGGTGAA AAACTATAGC AACCTCCTTG 720 

CTTTCTGTAG GAGAATTGAA CAGCACTATT TTGAAGATCG TGGTAAAGGC AGGCTGTCAT 780 

AGAGTTATGT GTTAGTCTCA GGAGTCTTAA CTTTTGAAAT ATGTTTTACT TGAATGTTAC 840 

A1TAGATATT GGTGTCAGAA TTTTAAAACC AAATTACTGC TTTTTGAAAC CTCAAATTAT 900 

ATAATGTATC TTATGTATGT GCTTTATATT GTTATTTGTG TATACATTAA AATAATTCTG 960 

AATTATTTAA TCTGATATGT TGTATTCTGT ATCTTGAAAT TTTTGTTTCC TTGAAACATG 1020 

CATGCATTTA AAAATAAAGC TTAAACAACT GTAAAAAAAA AAAAAAAAAA CTC 1073 



(2) INFORMATION FOR SEQ ID NO: 128: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 300 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128:

CAACCCCTGC CTTTTTTTTG TTTTCCATTT GCTTGGTAGA TCTTCCTCCA TCCCTTTATT 60 

TTGAGCCTAT GTGTGTCTCT GCCCGTGAGA TGAGTCTCCT GAATACAGCA CACTTACTGG 120 

TCTTGACTCT GTATCCAATT TGCCAGTCTG TGTCTTTCAT TTGGAGCATT TAGCCCATTT 180 

ACATTTAAGG TKAATATTGT TATGTGTGAA TTTRATCYTR TCATTATGWT GTTAGCTGGT 240 

TATTTTGCTT GTTAGTTGAT GCAGTTTCTT CCNGGCATCA ATGGTCTTTA CAANTTGGCA 300

(2) INFORMATION FOR SEQ ID NO: 129:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1275 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129:

GGCAGAGCCT GTCCCTGCTG CCCCTGCAAA AAAAACCCCC TCTGGTGTGA GCAGGATGGT 60 

TGGAGGTTAT GTGAGCTCCT TCTCCTTTCC TCCAGTTTCC TCTTCCCTTC TCCTCCCTGC 120

CTCTTTTGCT TTTCCCTTTC TTCCTGGTAC CCCCTGCCCA TTCCTGTATT TTCTCCCATC 180

GCCATTCTCC CCTCTCCCAC TGTCCCTAAC CCGTTCAAAC TCTTTCCTCT TAAATGGTTG 240

AGATTTTCTC TCACCAAGCA CACCCCAGTA TTAATTAAAC TAGCTGCAAA CAGGCAGCAA 300

GTGGTCTACC ATGACAGATG C^TTTTGTGT GTGTGTGTGT GTGTGTAATT GTAATAAAAC 360

ATATTGARTC ACTCAATAAA CACAGAGTGT CTACTACATG TATCARGCAC TATCATAGAT 420

GCTAATTAAC GAAACTGAAA TGGCCAGGCC CTCACAGTGG CTCATGCCTA TAATCCCAGC 480

ACTTTGGGAG GATGAGGCAG GAGGATCACT TGAGGCCGGG AGTTCAAGAC CAGCCTGGGC 540

AACATAGTAA GACTCCATCT CTACAAAAAA AAAATTTTTT TTATTATACT TTAAGTTTTG 600

GGTTACATGT GCAGAACGTG TAGTTTTGTT ACATAGGTAT ATACGTGCCC TGGTAGTTTG 660 

CTGCACCCAT CAACCCATCA CCTACATTAG GTATTTCTCC TAATGTTACC CCTCTCCTAG 720 

CCCCCCACCC CGTGACAGGC CCTGGTGTGT GATGTTCCCC TCCCTGTGTC CATGTGTTCT 780 

CATTGGTCAA CTCTCACCTA TGGAGTGAGA ACATGTGGTA TTTGGTTTTC TGATCTTGTG 840

ATAGCTTGCT GAGAATGTKG GTTTCCAGCT TTATCCACGT CCCTGCAAAG GGCATAAACT 900 

CATCCCTTTT TATGGCTGCA TAGTGTTCCA TGGTGTATAC GTGCCACATT TTCTTAATCT 960 

ATCATTGATG GACAAGTTTT GCTATTGTGA ATAGTGCCAC AATAAACATA CGTGTGCGTG 1020 

TGTCTTTATA GCAGCATGAT TTATAATCCT TTGGGTATAT ACCCAGTAAT GGGATCACTG 1080 

AGTCAAATGG TATTTCTCGT TCTAGATCCG TAAGGAATTG CCACACTGTC TTCCACAATG 1140

TTTGAACTAA TNTACACTCC CACCAACAGT GTAAAAGTGT TTCTATTTTT CCACAACCTC 1200 

TCCAACATCT GTTATTTCCT GACTTTTTAA TGAACGTCAT TCTAACTGGC GTGAGATGGT 1260 

ATCTCATTGT GGTTT 1275 






(2) INFORMATION FOR SEQ ID NO: 130:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 472 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130: 

CNGAAACCCC GTGAACCCTC CCCGGGTTAA AAAGCCCCCC CTAAATGGGG GGAACGCYTC 60

ACACGTTATA AAAAAGCACT AGAATGTTTT GAAAGCGAGA AACAACAGCT GTGTAGGGTA 120

GCTAGCAGTT AGTGTTGTAC AGAAGACAGA TATTTGTGCA TTTYTGCATT TTCTAAGTTT 180

GCTGCAATGA GCATGTATTA CTTTCATAGT TATAAAACAC ATGCAAAATG CCCTTTTAAA 240

ATGAAAAAAA ATCCATGAGT GTAAGTGATA TATATGCTTT GGAAAGCCTG GGACGGTCAT 300

TGTTTACTCT CAATAGTATG TGTTTGCCTT TGTCTTTTTG AGACATTTTG TTTTAATCTG 360

TTGATGACAA TAACCTGTTG ATAATATAAC TTGATAACAA ATAAAATGAC TTATGATTGA 420

AWMAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA MN 472

(2) INFORMATION FOR SEQ ID NO: 131:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1950 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131: 

ACCTCTCAGA ATCTTCTCTC AGCAACCTGA GTCTTCGCCG TTCCTCAGAG CGCCTCAGTG 60

ACACCCCTGG ATCCTTCCAG TCACCTTCCC TGGAAATTCT GCTGTCCAGC TGCTCCCTGT 120 

GCCGTGCCTG TNATTCGCTG GTGTATGATG AGGAAATCAT GGCTGGCTGG GCACCTGATG 180 

ACTCTAACCT CAACACAACC TGCCCCTTCT GCGCCTGCCC CTTTNTGCCC CTGCTCAGTG 240 

TCCAGACCNT TGATTCCCGG CCCAGTGTCC CCAGCCCCAA ATCTGCTGGT GCCAGTGGCA 300 

GCAAAGATGC TCCTGTCCCT GGTGGTCCTG GCCCTGTGCT CAGTGACCGA AGCTCTGCCT 360

TGCTCTGGAT GAGCCCCAGC TCTGCAACGG GCACATGGGG GGAGCCTCCC GGCGGGTTGA 420

GAGTGGGGCA TGGGCATACC TGAGCCCCCT GGTGCTGCGT AAGGAGCTGG AGTCGCTGGT 480

AGAGAACGAG GGCAGTGAGG TGCTGGCGTT GCCTGAACTG CCCTCTGCCC ACCCCATCAT 540

CTTCTGGAAC CTTTTGTGGT ATTTCCAACG GCTACGNCTG CCCAGTATTC TACCAGGCCT 600

GGTGCTGGCC TCCTGTGATG GGCCTTCGMA CTCCCAGGCC CCATCTCCTT GGCTAACCCC 660

TGATCCAGCC TCTGTTCAGG TACGGCTGCT GTGGGATGTA CTGATCCCTG ACCCCAATAG 720

CTGCCCACCT CTCTATGTGC T'GTGGAGGGT CCACAGCCAG ATCCGCCAGC GGGTGGTATG 780

GCCAGGCCCT GTACCTGCAT CGCTTAGTTT GGCACTGTTG GAGTCAGTG: TGCGCCATGT 840

TGGACTCAAT GAAGTGCACA AGGC~GTGGG GCTCCTGCTG GAAACTCTA3 GGCCCCCACC 900

CACTGGCCTG CACCTGCAGA GGGGAATCTA 2C3TGAGATA TTATTCGTGA CAATGGCTGC 960

TCTGGGCAAG GACCACGTGG ACATAGTGGC CTTCGATAAG AAGTACAAGT CTGCCTTTAA 1020

CAAGCTGGCC AGCAGCATGG GCAAGGAGGA GCTGAGGCAC CGGCGGGCGC AGATGCCCAC 1080

TCCCAAGGCC ATTGACTGCC GAAAATGTTT TGGAGCACCT CCAGAATGCT AGAGACCTTA 1140

AGCTTCCCTC TCCAGCCTAG GGTGGGGAAG TGAGGAAGAA GGGATTCTAG AGTTAAACTG 1200

CTTCCCTGTT GCCTTCATGG AGTTGGGAAC AGGCTGGGAA GGATGCCCAG TCAAAGGCTC 1260

CAAGCGAGGA CAACAGGAAG AGGGATCCAC TGTTACCAAA AGTCCTGATT CCCCCATCAC 1320

CAACCTACCC AGTTTGTTCG TGCTGATGTT GGGGGAGATC TGGGGGGAGT TGGTACAGCT 1380

CTGTTCTTCC CTTGTCCTAT ACCGGGAACT CCCCTCCAGG GTACCCACAG ATCTGCATTG 1440

CCCTGGTCAT TTTAGAAGTT TTTGTTTTAA AAAACAACTG GAAAGATGCA GAGCTACTGA 1500

GCCTTTGCCC TGAATGGGAG GTAGGGATGT CATTCTCCAC CAATAATGGT CCCTCTTCCC 1560

TGACGTTGCT GAAGGAGCCC AAGGCTCTCC ATGCCTTTCT ACCTAAGTGT TTGTATTTTA 1620

TTTTAAATTA TTTATTCTGG AGCCACAGCC CCCTTGCTTA TGAGGTTCTT ATGGAGAGTG 1680

AGAAAGGGAA GGGAAATAGG GCACCATGGT CCGGTGGTTT GTAGTTCCTT CAAAGTCAGG 1740

CACTGGGAGC TAGAGGAGTC TCAAGCTCCC CTTAGGAAGA ACTGGTGCCC CCTCCAGTCC 1800

TAATTTTTCT TGCCTGCCCC GCCTTGGGGA ATGCCTCACC CACCCAGGTC GTGACCTGTG 1860

CAATAAGGAT TGTTCCCTGC GAAGTTTTGT TGGATGTAAA TATAGTAAAA GCTGCTTCTG 1920

TCTTTTTCAA AANAAAAAAA AAAAAAAACT 1950



(2) INFORMATION FOR SEQ ID NO: 132: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 990 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132:

TGGAAGATTT AAAATAGCTT TCATATTTCT CTTGAATATG AATATATAAG CTTGAATAAG 60

TTTGAGTCGT TATTATTATG AAATTTTCCT TATTATTTCT ACCAATGCT? CTTATATTAA 120

A.GCCTGA7CT 7TTTCA7ATT AGTATA7GTA CATTAGCTGC CTGTGGATTA ACATTTCCAT 180

^GCATTGT TTGATCTT AA ACTTTTTGTG TCTTT AT AT A AGGTATGCTY 240

IA7ATTTTT AA-GACAATA GTTGAAAGAC AATCTYCACC TTTTACTTGT 300

AT T7TTGATGCA TATTACGTCT TATTATTTAA CCAACCTATT 360

TAC-C-GCATTT TTCAGAAAGC CTTATTTTCT TGTATTAATC AAATATTTTT 420

AYCATTGTA7 TTTCCYTTAT TAGTTAGKAA TACGKTACYC YAAATATATA TTGTGGSTAT 480

TT7CACAA7T GCAAT.-.TGCC TCCTTAATTT ATTAGAGGCT AACCTAAATT ATTACTTTTA 540

CCACTTACTT GA AA A.7TCTG GAACTTTAGA ACATTTATTG TTTTATGCAT TTTAATTCTA 600

CT^G^^ r ~TTT TiCTACTCCT AAACATTATT ATTGTTTTAG ACAAGCCAAA ATATATNTTG 660

TTATTATCTT ATYCTCCATT TCTTTCTGTA TTTTTATGCC ACTATGTATG CTCAATTTCC 720

TTCTATOTGA TGAACCTAAT TCAGTACTTT TGTTTTTTAA TCTGTGCAGG TAGCCTGGCC 780

ATTAA A.TTTT TATTTTTGGT TTGCTGAAAA AATTGTGTTT ATTTCTATAT GCATACTTAT 840

GCA7ATA-GAA 7>ICTA.GGT?7G ACATATTTTT AGTATTTATA AATGTAAAGT CATT//ATTKG 900

GCTTCTA.TCA TTTCKGTXGA GAAATCAATT GTCAGCGCAA TAGTTTTTCA TTTTAAATTA 960

CTiC-.ATTTTT 7CATGTT7CT GGTTTTAGGA 990

(2) INFORMATION FOR SEQ ID NO: 133:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1720 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133:

STCTGATAAG CGACTGTGGT TATTCCCCTA AAGTTTACTT CAGCACTAAC ACTAGTGCTT 60

CCGCTGGAGT TTOCAGTTTT CCAGCTTTAT ACAGGATTTT CCTTTGACTG GAAGAGTCAA 120

GCA.TATAGAG ACTCAACAGT GACATTTATT GTACAACATC AAGGGGAATA GGATACTCAT 180

CAAACTGGGA TTATTCTTAT CAAAACATGG TCTTCTTTGA ATAAGAAAAA TACATAGTTG 240

GTTATTATGG ACTTAAAACT GTGTTAAATG GATATTCTGA TAAAATATTT GCTGCTCTGT 300

A3AGTGTOGA AAATCTGAGA A7ATTAGCTT TACTCATCTT GAGCTTTGAG GATGTTCTCT 360

GTACGCCGAT GGTTTCATAT TAACTAAAAA AGCTGGGTAT TGTAAAATCT CATTTATAAA 420

AACTCAGATj AGAAGAAAAT TTTCTTTGAT GGTGAGACTG TTGTCTTAGT TCAGGAAATT 480

A7TTAATAAT CCTTTGTTAC CTGTGAA7GA AGGAACTTTG TAATTCTGAT TTATCGTAAA 540

ACATGAGCCT TTCCAGAGTC AGCTTAGACA CTGTTGTCGC AAATAGCCAT GCTTTGCCTT 600

ATGCCAAGGA GGCCCAGAGG GAGGGCCTAG TCTTCCTCTG TTGCTGTACA TATATTGAAA 660

TGCTTTTTTT TTTTATTTTG CATTTGTTAT CTATAATGAG CTTTCTGAGC CCTGATATTA 720

TGTGAGACAA ACAGGAGTTA TTGATGTTAT ACACTCCCTT CCATTCAGGA TTTTCTGCTT 780

GGAGGGAAAT ATGTTGACCT TAGAGAATTG TGAATATTGT TGCAATTCTT GAATATATTA 840

CCATGTGAAT AATAGAGACT GTGTTGCTCT CTAGTATAAG CTATATTTAT TTTTGATTCA 900

TTTGAATTAC TAGTTATAAC TGGAGAAATT TTGTTACCTC TATCCTGGCT TGCCTGACTG 960

GCTGTATAAT AGCAGCAGCC TCTTTTAGAG CATCTTAATG AAAACATGGA TGAAAGGAAT 1020

TAATGATGAT ATCTGCAGAC TGCGTAGAAA ATGGCTTTTG TTCCCAGCGT TAACATTTTC 1080

TTCTCAATCA CATTTCAATG TTTGTGGAGA GTGGCAGATT CACACGAGAA ACACTAGGTG 1140

TTCATATCCA TAGCATGGAT GCAGAATAAG CAGTTGGGAG AGAAGCTTCT TCCTACCTGG 1200

TACTCCTCCC ATTCACCTCA GCCCAGCCCC AGACAGGCGT TAGCATTCAG TGTGGGCCGT 1260

CAGGCAGCCC TGAAGCCTGG CTGGGTCATC AGATGGGGGC AGCCTGTGAC GGGCACCAGC 1320

GGCCTGATTC CAGGGAAGAG TTCCTGGAGG GTGTTGGCTG TTTTTGTTAG CTCAGTTTTT 1380

TTCTGGGCTC CACCATTCCT AACTCCAGGT AGACAAGATA GATGTCACAC ACAACAATTT 1440

TAAAGTATTT TGCTTAGTGC ATTTTGTTTA TGATTGCAGT GTTTGTTTGT TATTTAATAG 1500

GCTTTTTACT TCATTCTATT AAATTTTAGT GTTTAGAAGA GGCGGGTACT GTCACTGTGT 1560

AAAATATGTA ATATTTTATA TGTTATAGCA TGTCATATAT ACTTGCAATA TCAGACCTTG 1620

CATTCAATAT ACAATGCAAT TGACTCTTTG CAGACCTGCA TTTTTCAGTG AACAATAAA^. 1680

AGATTGTCTG GCACTCCAAA AAAAAAAAAA AAAAAAAAAA 1720



(2) INFORMATION FOR SEQ ID NO: 134: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 705 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134:

GGCACGAGGC CATCTGGGCT CATTCAGCAG GAAATAATGG AAAAAGCTGC AATATCCAGG 60

TGTTTACTAC AATCTGGAGG CAAGATCTTT CCTCAGTATG TGCTGATGTT TGGGTTGCTT 120

iAATCAC AGACACTCCT AGAGGAGAAT GCTGTTCAAG GAACAGAAC3 TAGTCTTGGA 180

TTAAATATAG CACCTTTTAT TAACCAGTTT GAGGTACCTA TA^GTGTA'TT' TTT3GACCTA 240

TCCTCATTGC CCTGTATACC TTTAAGCAAG CCAGTGGAAC TCTTAAGACT AGATTTAATG 300

ACTCCGTATT 7GAACACCTC TAACAGAGAA GTAAAGGTAT ACGTTTGTTIA AATCTGGGAA 360

GACTTGACTG CTATTCCATT TTGGGTATCA TATGTACCTT GATGAAGANG ATTAGGTTGG 420

GATACTTCAA GTGAAGCCTC CCACTGGAAA CAAGCTGCAG TTGTTTTAGA TAATCCCATC 480

CAGGTTGAAA TGGGAGAGGA ACTTGTACTC AGCATTCAGC ATCACAAAAG CAATGTCAGC 540

ATCACAGTAA AGCAATGAAG AGCAGTTTTC CAATGAAAAC TGTGTAAATA GAGCATCAAC 600

AAGTACAAAA TTCTTGTCTT AATTAGTGGG GGTATATAAA AATTCCTTGT AATGGTCAAA 660

TATTTTTTAA AATTGACATT AATAAAGCAT ATTTTAAAAG TTTCT 705

(2) INFORMATION FOR SEQ ID NO: 135:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 323 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135:

AGCACACACC TCCTTTAGTT GCTCCTAAGG TCATGTTCAA CATTCGTGGA GTGCATTTTC 60

TGCTCAGGGA GCTTTCCCAG ACCCGGAATG TTTGGTGCTC ACAGACYCTG GCAAGGATCG 120

GTATTGCTGT TCCTCAGTTT TGCCTGGGGA AATGGAGGST CAGTGACGTT CAGTGACGTG 180

CCCAGAGTCA TGCCATTGGC GGGTGGCCCA GKGMTCCAGG TCTCCAGCAC CCCTCGGCCC 240

CCTCCTCACC AGGTCACATC ATCTCCTGGA TTAGAATCTG CTCACATAGT CTGTCCTGAA 300

AGGAAAAAAA AAAAAAAAAA AAC 323

(2) INFORMATION FOR SEQ ID NO: 136:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 582 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136:

GGACGGAATG GTGCAACCCT CCTWAMTTTT CTKGXGCTGT TGACAACAGA GGGAGGGAGG 60

GAAAACATTT TTYGTGGGAG AATCCTACYT CTGOAGSGGA GCCCTTAAGC GATKC-ATTTT 120

GAATCTKGAC CCTTTACCAA CTAATTTTGA AGOAAGATAC CTTGGAAATA TTTGGCATTC 180

AGTGGGTTAC TGAAACAGCA TTAGTGAATT CATCTAGAGA ACTCTTTCAT TTATTCAGGC 240

AACAACTGTA CAACTTGGAA ACCTTGTTAC AGTCCAGTTG TGATTTTGGG AARGTATCAA 300

CTCTACACTG CAAAGCAGAC AATATTAGGZ AGCAGTGTGT ACTATTTCTC CATTATGTTA 360

AAGTTTTCAT CTTCAGGTAT CTGAAAGTA: AGAATGCTGA GA3TCATGTT CCTGTCCATC 420

CTTATGAGGC TTTGGAGGCT CAGCTTCCCT CAGTGTTGAT TGATGAGCTT CATGGATTAC 480

TCTTGTATAT TGGACACCTA TCTGAACTTC CCAGTGTTAA TATAGGAGCA TTTGTAAATC 540

AAAACCAGAT TAAGGTTTGA CTGGTTTCAT TTGATTTTTA AG 582



(2) INFORMATION FOR SEQ ID NO: 137: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1021 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137:

TTCGGCAGAG CCCTTGCGCG CTCTTGAATA CCTGCKTTCT GTAGCGCTAG TTCTCTTCAA 60 

GATTTGCTTA GTGTCATTTC ATTTCGGTTT CTTTTCTCGC CATGTTTTTC TGTCGGAATT 120 

ACGGTTCGTT TTGGTTCTAT GTACTCTCTA AAATGTTATC GTTTTTCATT TGTCTACTAA 180 

TTTTCGTGCA TTTGTTACTA CTGAGTTTCT TAATATCTGA CTGGCCTCCG CCCACGGGCT 240 

CTGCAGANCA TAAAATACTC AGGCTGATGG TAGTGCAGAG ACTCTCCCTC CTTGATCAGC 300

GCAAACGTTG GTCTGAGGCT TGAGGGATGG AGCAACATTT TCTTGGCTGT GTGAAGCGGG 360 

CTTGGGATTC CGCAGAGGTG GCGCCAGAGC CCCAGCCTCC ACCTATTGTG AGTTCAGAAG 420 

ATCGTGGGCC GTCXXCTCTT CCTTTGTATC CAGTACTAGG AGAGTACTCA CTGGACAGCT 480 

GTGATTTGGG ACTGCTTTCC AGCCCTTGCT GGCGGCTGCC CGGAGTCTAC TGCCAAAACG 540 

GACTCTCTCC TGGAGTCCAG AGCACCTTGG AACCAAGTAC AGCGAAGCCC ACTGAGTTCA 600

GTTGGCCGGG GACACAGAAG CAGCAAGARG CACCCGTAGA AKARGTGGGG CAGGCAGARG 660 

AACCCGACAG ACTCAGGCTC CRGCAGCTTC CCTGGAGCAG TCCTCTCCAT CCYTGGGACA 720 

GACAGCAGGA CACCGAGGTC TGTGACAGCG GGTGCCTTTT GGAACGCCGC CATCCTCCTG 780 

CCCTCCAGCC GTGGCGCCAC CTCCCGGGTT TCTCAGACTG CCTGGAGTGG ATTCTTCGCG 840 

TTOGTTTTGC CGCGTTCTCT GTACTCTGGG CGTGCTGTTC ACGGATGTGT GGAGCTAAGC 900

AGCCTTAGAT AGCAGCAGAA GGCTTTTTGG ATTCTCCTCC TTGAAAAGAT TCTCAGTTAC 960

CAAACGTCTC CACCTAGAAA ATAAAAATAI ATTAAGATGT TGANAAAAAA AAANAAAAAA 1020

A 1021

(2) INFORMATION FOR SEQ ID NO: 138:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1777 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138:

CGGAAGATGA TGGCTTCAAC AGATCCATTC ATGAAGTGAT ACTAAAAAAT ATTACTTGGT 60

ATTCAGAACG AGTTTTAACT GAAATCTCCT TGGGGAGTCT CCTGATCCTG GTGGTAATAA 120

GAACCATTCA ATACAACATG ACTAGGACAC GAGACAAGTA CCTTCACACA AATTGTTTGG 180

CAGCTTTAGC AAATATGTCG GCACAGTTTC GTTCTCTCCA TCAGTATGCT GCCCAGAGGA 240

TCATCAGTTT ATTTTCTTTG CTGTCTAAAA AACACAACAA AGTTCTGGAA CAAGCCACAC 300

AGTCCTTGAG AGGTTCGCTG AGTTCTAATG ATGTTCCTCT ACCAGATTAT GCACAAGACC 360

TAAATGTCAT TGAAGAAGTG ATTCGAATGA TGTTAGAGAT CATCAACTCC TGCCTGACAA 420

ATTCCCTTCA CCACAACCCA AACTTGGTAT ACGCCCTGCT TTACAAACGC GATCTCTTTG 480

AACAATTTCG AACTCATCCT TCATTTCAGG ATATAATGCA AAATATTGAT CTGGTGATCT 540

CCTTCTTTAG CTCAAGGTTG CTGCAAGCTG GGAGCTGAGC TGTCAGTGGA ACGGGTCCTG 600

GAAATCATTA AGCAAGGCGT CGTTGCGCTG CCCAAAGACA GACTGAAGAA ATTTCCAGAA 660

TTGAAATTCA AATATGTGGA AGAGGAGCAG CCCGAGGAGT TTTTTATCCC CTATGTCTGG 720

TCTCTTGTCT ACAACTCAGC AGTCGGCCTG TACTGGAATC CACAGGACAT CCAGCTGTTC 780

ACCATGGATT CCGACTGAGG GCAGGATGCT CTCCCACCCG GACCCCTCCA GCCAAGCAGC 840

CCTTCAAGTT CTTTTATTTC TGGGTAACAG AAGTAGACAG ACAGGTTACT TGGTGTATCT 900

TCTGTTAAAG AGGATTGCAC GAGTGTGTTT TCCTCACACA CTTTGATTTG GAGAATTGGT 960

GCTAGTTGGC AATAGATAAC TCAGCGTAGA TAGTATTGCA AAAAGGGGAG GAAATACACA 1020

ACAATAATAA ATGTAAAAAC CTGCTATTCA ACATGCAGTT TTATTTCGAR GCCAAAAATC 1080

TAGAGCTTTC CCAAGATCCT GTTGCCTTAG GCACATNCAC ACTTCAACAG TGCACACTAT 1140

CCAACAGTGC ACACTATTCA ACAGTGCACA CTATTCAAAA GCGTAGACTA TTTTTTTGCA 1200

7GTTCAAGAT ATTTGTTTTG GTCTTATGTG TGTGTGAGAG AGAGAGATTC CTTTGACATT 1260

AAGGAGGATC AATGAGAAAA GATGATGAGG CAGGAATTAA TAAAGAAATG AAGTCGTGTG 1320

TGTTTGGTTG CCTGTCAGAG GGCACACAAT TTCATAAACA CCATGCCTGG ACAATTTGAT 1380

ATTAATATTT AACACCTCTG CATCTTTTTC TTAAAAAAGA ATATGGGCCA GATACAGTGG 1440

CTCACATTTG TAATCCCAGC ACTTTGGGGA GCCAAGTTAG CAGAATCCCT TGAGCACAGG 1500

AATCTGAAAC CAGCTTGGGC AACATAGTGA GATCCCATCT MTACAAAAA.A CTTAAAAATT 1560

AGCCAGGCA7 GATGGCACAT TCCTGTAGTC CTAGCTACTC AGGAGGCTAA GGTAGGAGGA 1620

TTGCCTGAGC CCAGGAGTTC AAGGCTGCAG TGAGCTAAGN ACGTGCCAGT ACACTCCAGC 1680

CTGAGCCACA AAGTGAGACC CTGTCTCGCA AAAAAAAAAN TTAAAAAGTC GGGGGGGGGC 1740

CCGGTACCCA AATCGCCGGA TATGATCGTA AACAATC 1777



(2) INFORMATION FOR SEQ ID NO: 139:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 643 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139:

TTTTTTTTTT TTTTTTTTTT TTTTTTTTTT TTTTTTTGGG AATGAGAAAA TAACTTTATT 60

TTCATTGTGG GGAGCGGGCC GATGTCCAGC CTCAGAACTT CTGGAACTGC TTCTTGGTGC 120

CGGCAGCCTT GGTGACCTTG AGCACGTTGA AGCGCACTGT CTTGCTCAGA GGCCGGCACT 180

CGCCCACTGT GACGATGTCA CCGATCTGGA CGTCCCTGAA GCAGGGGGAC AGGTGTACAG 240

ACATGTTCTT GTGGCGCTTC TCGAAGCGGT TGTACTTGCG GATGTAGTGC AGATAGTCTC 300

GGCGGATGAC AATGGTCCTC TGCATCTTCA TCTTGGGTCA CCACGCCAGA GAGGATCCGC 360

CCTCGAATGG ACACATTACC AGTGAAGGGG CATTTCTTGT CAATGTAGGT GCCCCTCAAT 420

AGCCTCCTTG GGGTGTCTTT GAAGCCCAGA CCGATGTTCT TGTTAGTAAC CCGCGGGAGC 480

TTCTCCTTGC CAGTTTCTCC CAGCAGGACC CTCTTCTTGT TTTGAAAGAT GGTCGGCTGC 540

TTTTGGTAGG CACGCTCAGT CTGAATGTCC GCCATCTTCT CGTGCCGMAY TCCTGCAGCC 600

CGGGGGATCC ACTAGTTCTA GAGCGGCCGC ACCGCGGTGG AGC 643



(2) INFORMATION FOR SEQ ID NO: 140: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1220 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140:

GGCACGAGGA TGATAGACCT ACTGGAGGAA TACATGGTTT ACAGGAAGCA TACCTACATR 60 

AGGCTTGATG GCTCATCCAA GATCTCGGAG AGGCGAGACA TGGTTGCTGA TTTTCAGAAC 120 

AGGAATGACA TCTTTGTGTT CCTGTTAAGC ACACGAGCTG GAGGACTGGG TATCAATCTC 180

ACTGCTGMAG ACACAGTGCA TTTTCTATGA TAGCGACTG3 AACCCCACTG TGGACCAGCA 240

GGCCATGGAC AGGGCCCACC GCTTAGGGCA GACAAAGCAG GTTACTGTGT ACCGGCTCAT 300

CTGTAAAGGC ACCATTGAAG AACGCATTCT GCAAAGAGCC AAGGAGAAGA GTGAGATTCA 360

GCGGATGGTG ATTTCAGGTG GGAACTTCAA ACCAGATACC TTGAAACCCA AAGAGGTGGT 420

TAGTCTTCTT CTAGACGACG AAGAGTTGGA GAAGAAACGT ATGTACTCTA AACCTCTATA 480 

CACTCCCCTC ACGTATCTGA GAATGGAAGA GGTACTTGGS TGTGTGCCAA GGGTTAGGCA 540

AAGCCAGAGG CTGTATTTAG GGAAAGTATT TTTGTGCTCA TATTTTATAT AAAAACCCAA 600 

ACAAGAATGT GTTTGTAGGC CAGGCGTGGT GGCTCGCGCC TCTAGTCTCA GCATTTCGGG 660 

ARGCCAAAGT GGGCAGATCA CCTGARGTCA GGAF.TTTGAG TTTGARACCA GCCTGGCCMA 720 

CGTTGTGAAA CCCCACCTCT ACTARGARTA CSGAAAATTG GTTGGGCATG GTGGCGGGCA 780

CCTGTAATTC CAGCACTTTG GGAGGCTGGG GCAGAAMAAT TGCTTGAGCC CAGGAGGTGG 840

AGATTGCGGT GAGCCGAGAT YGTGCCATTG CAKTCCAGCC SGGGCAATAA GAGTGAAAYT 900 

CCATCTTTTA AAAACAAACA AAAACAAAAA ACACAAGACG GCTCACACCT GTAATCCCAG 960 

CACTTTGGGA RGCCGARGCA GGTGGATCAC GARGTCAGGA GTTCCAAGAC TAGCCTGGCC 1020 

AACCTGGTGA AGCCCCGTCT CTACTAAAAA TACMAATATT AGTCGGGCGT GGTGGTGGGC 1080 

ACGTGTAATC CCAGCTACTC GGGAGGCTGA GGCAGGAGAA TCCCTTGAAG CTAGGAGGCA 1140

GAGGTTGCAG TGAGCCAGGA TCGTGCCATT GCACTCCAGC CTGGACAACA AGAGCAAGAT 1200 

TCCATCTCAA AAAAAAAAAA 1220 




(2) INFORMATION FOR SEQ ID NO: 141: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 721 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:

AATTCGGCAC 3AGCCAGGTT AGCC3GAAGG GCAGCTCTCC AGGCCCTGCC CACCCCACAG 60

GGGGCTCCTT ATGCACAGCG GGGCGTCTCC TTGTGGCCAT AGAAACCJGAA CTGGCTCTTT 120

TCAACAGTGC TGCAAGAGGA TGGTTATTTA AIGCTGGCCC CCAAGGAGGA AAG3CACACA 180

CYTTCCTCCC TCCTGGAACA TCCAAGGGCA CTGGATCCTC TGTGTCCCTC TGAGATGGGG 240

TGCCACTCCA GCAAGAGCAC CACGGTGGCA GCTGAGTCCC AC^AGCTTGA AGAAGA3YX 300

GAGGGAAGAG AGCCAGGTCT GGAGACCGGC ACCCAGGCAG CAGACTGCAA GGATGCCCCG 360

CTGAAGGATG GAACCCCTGA GCCAAAGAGC TGAAATGCCT CTCTCCAGAG TCGGACCCTC 420

ACCTCYTTCC TGGAACTGCC TTTGGCCCCA GAACCATGAG ACAATCCCCA CC CTG A'GAAG 480

CTCCGATCAC TGGGAGGAGA GAGAAAGCCT CCAGCTTTGG GATTCAGGCT TCAGAAGTTT 540

TTAGCAGCCT TTGCTCATTG GAGAGGTGGG GAAAGGATAA AGTTCTTATA AGGAAATCCC 600

TAATTTCCCC CAGCTCCTCC CCNCCNGAAG AAGGAACNAA AGAAAGTTCC TTCCACACGT 660

TTTGTTGGAA ACTTTTCCCT TGCCAACTTT CCTTGGATTG CCAGAACAAA GCCCTCCAGA 720

A 721

(2) INFORMATION FOR SEQ ID NO: 142:



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1469 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142:
ATGAATTAAT GTTTATAAAT GACTGTACTG AATTTAAAAC CGTACAGTTT C\TTTGCATT 60 
45 TTGACATTAC TTTATTATAC ATTTTGCATT TAAAAGGCTG CACCAGTTGG CTTTTCTTCT 120 
GTTTTATTCT CAAAATATAG AGATTCTGTG ATTTATTTGC CCTGTTTATG GATTAAAAAG 180 
AAAATTCTAA TAT AAAGC AT TTCAATAGGA TGCATAGGTA TATTACGTTT TTTAAATGCT 240 
TTAGATCTGT GATTCTTGAC TTACTATTTA TTTTATCCCC TTTAAGTCAG GGATGCTTTA 300 
TTCTATTTTA AAGCACTTAT GAGTTACATG TTGTAATCAA GTTTGCACAA TATATTTATC 360 
55 TATATGAGGA ACCCATAAAT GAATAGCTAA TTTTTAAAAT GCCATTAAAA TGCATGAAAT 
KCTTATTAAA ACCTTACTAT ACTATTTCTT CAAGGCAAGT AAATTGACCA TGRGRAAAGR 



420 
480 



60 



ACACAGTTAT TAAACACTGT TGACAGGAAA ATTCTCCTTG ATAA'GATAGG ACAATTAATG 54 0 



10 
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GAAAAAAAAA TTCTCATTAT TTGCAAAGAA TGAACAAGTT AA.TGAACAAA CAAACTAGAT 60 C 

TTGGTATGTT TTCAGCTTTT GTATCATGTT TAATTGTTTA AT/TTGGTTGA AAAACTGCAG 660 

TTGAGAAATC AGATAGCAAT ATAGACATTC ACAGCAGCTC TGTGGATACC ATGTAATTGT 72 0 

CAGGTAATT7 CAGAATGTTG AAAATTATTC AGTGCAGCCC 7CATAGTATC ATACTTGAAG 7 30 

AAATTGA1TA CAGTTCCACT AAATTGTTGA AGATAAATTA TTTTTAAAGG TTATGAAAAC 840 

TAAGTTATAT TAATTCATAT GTTTGATTTT TAAATCCC AC CTCCTCAAGC TATCCAATTT 90 C 

NCTGACTTTG AAAAT AAC C A TGAGAGATGC CACATTTCTC TCTGGGAAAG TACCACTCAA 960 

15 AGAATAATTG TTAAAAATTA AGCTTTTAGG T ATT AG AA 3C TGTTATAAAG TATAAAATTA 1020 

AGATATAAGC AGATCACATG TAAATCATTC CTAAAGCACA AGAAAAGAAT GTGCCTTGAT 108 0 

GTACATATAT TACTAAGTTG CCTCTCCCAG TTTACTTTAA AAATGGCTTT AAGGATAAAG 114 0 

20 

AATAAATGTG ATAGCTGTGC ATGCATTATA TATTTGCATT TGCAAATTTC CCATTGTTTT 1200 

AACAGCTGTG TGGCTGACTT TCAATTTTAA GACGTGAATT GACATACAGC CCATAACTTT 1260 

25 ATAATGGCTG CTCATTTATC TTATCTTTCA GTTAGTGGAA AAACATTTCA AC CTG ACT AA 13 20 

AATTTGGAAT TGTGTCTTTT ATGTTCCATC CTCTGTTGTT ACTAGATTTA GTTTAAAAAT 1380 

TGTGTATGAC CATTAATGTA TGTCATAAAC ATGTAAATAA AAGATGTTGA ATCTTGTTGA 1440 

AAAGCAWRAA AAAAAAAAAA AAACTCGA 1468 




(2) INFORMATION FOR SEQ ID NO: 143: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 300 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143: 

45 

TGAATTTTTT GCCAAACTTA GTAACTCTGT TAAATATTTG GAGGATTTAA AGAACATCCC 60 

AGTTTGAATT CATTTCAAAC TTTTTAAATT TTTTTGTACT ATGTTTGGTT TTATTTTCCT 120 

TCTGTTAATC TTTTGTATTC RCTTATGCTC TCGTACATTG AGTACTTTTA TTCCAAAACT 180

AGTGGGTTTT CTCTACTGGA AATTTTCAAT AAACCTGTCA TTATTGCTTA CTTTGATTAA 240 

AAAAAAAAAA AAAAAAAAAA AAACCCCNAG GGGGGGGCCG GGTNCCCAAT CCCCCCCAAA 300 



(2) INFORMATION FOR SEQ ID NO: 144:






(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2243 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144:

TGCCTCCCTT CCTGCAGATT GTGGACAGTA GTTCCTCAGC CTGCACCCTG GATTCCTTCT 60
TCCCCTTCCT AGCTCCATGG GACTCGCCCC AAGACTGTGG CTTCAAOjAC CACCAGCCCC 120
TTACTCTTCA AGCCCTGACT GTGGAGTTGG TAGATGCCTC TGATCCTCAG TATTCTCTCT 180
GGCAATGTTC CACGGCTTCT CCTTCCTGGG AGCTGGCTCC ATAACTTGAT TTTCCCCAAA 240
CGTGTTGCAA TCCCTGCTGC CCCTTAGCCA CCCAGGGTCT TGTGTGGGTA TGAGTGTAGA 300
GGATGGGGGT ATGCCAGGCC TGGGCCGTCC CAGGCAGGCC CGCTGGACCC TGATGCTACT 360
CCTATCCACT GCCATGTACG GTGCCCATGC CCCATTGCTG GCACTGTGCC ATGTGGACGG 420
CCGAGTGCCC TTYCGGCCCT CCTCAGCCGT GCTGCTGACT GAGCTGACCA AGCTACTGTT 480
ATGCGCCTTC TCCCTTCTGG TAGGCTGGCA AGCATGGCCC CAGGGGCCCC CACCCTGGCG 540
CCAGGCTGCT CCCTTCGCAC TATCAGCCCT GCTCTATGGC GCTAACAACA ACCTGGTGAT 600
CTATCTTCAG CGTTACATGG ACCCCAGCAC CTACCAGGTG CTGAGTAATC TCAAGATTGG 660
AAGCACAGCT GTGCTCTACT GCCTCTGCCT CCGGCACCGC CTCTCTGTGC GTCAGGGGTT 720
AGCGCTGCTG CTGCTGATGG CTGCGGGAGC CTGCTATGCA GCAGGGGGCC TTCAAGTTCC 780
CGGGAACACC CTTCCCAGTC CCCCTCCAGC AGCTGCTGCC AGCCCCATGC CCCTGCATAT 840
CACTCCGCTA GGCCTGCTGC TCCTCATTCT GTACTGCCTC ATCTCAGGCT TGTCGTCAGT 900
GTACACAGAG CTGCTCATGA AGCGACAGNG GCTGCCCCTG GCACTTCAGA ACCTCTTCCT 960
CTACACTTTT GGTGTGCTTC TGAATCTAGG TCTGCATGCT GGCGGCGGCT CTGGCCCAGG 1020
SCTCCTGGAA GGTTTCTCAG GATGGGCAGC ACTCGTGGTG CTGAGCCAGG CACTAAATGG 1080
ACTGCTCATG TCTGCTGTCA TGAAGCATGG CAGCAGCATC ACACGCCTCT TTGTGGTGTC 1140
CTGCTCGCTG GTGGTCAACG CCGTGCTCTC AGCAGTCCTG CTACGGCTGC AGCTCACAGC 1200
CGCCTTCTTC CTGGCCACAT TGCTCATTGG CCTGGCCATG CGCCTGTACT ATGGCAGCCG 1260
CTAGTCCCTG ACAACTTCCA CCCTGATTCC GGACCCTGTA GATTGGGCGC CACCACCAGA 1320
TCCCCCTCCC AGGCCTTCCT CCCTCTCCCA TCAGCAGCCC TGTAACAAGT GCCTTGTGAG 1380
AAAAGCTGGA GAAGTGAGGG CAGCCAGGTT ATTCTCTGGA GGTTGGTGGA TGAAGGGGTA 1440
CCCCTAGGAG ATGTGAAGTG TGGGTTTGGT TAAGGAAATG CTTACCATCC CCCACCCCCA 1500
ACCAAGTTCT TCCAGACTAA AGAATTAAGG TAACATCAAT ACCTAGGCCT GAGAAATAAC 1560
CCCATCCTTG TTGGGCAGCT CCCTGCTTTG TCCTGCATGA ACAGAOTTGA 7GAAAGTGGG 1620
GTGTGGGCAA CAAGTGGCTT TCCTTGCCTA CTTTAGTCAC CCAGCAGAGC CACTGGAGCT 1680
GGCTAGTCCA GCCCAGCCAT GGTGCATGAC TCTTCCATAA :3GGATCCTCA CCCTTCCACT 1740
TTCATGCAAG AAGGCCCAGT TGCCACAGAT TATACAACCA TTACCCAAAC CACTCTGACA 1800
GTCTCCTCCA GTTCCAGCAA TGCCTAGAGA CATGCTCCCT GCCCTCTCCA CAGTGCTGCT 1860
CCCCACACCT AGCCTTTGTT CTGGAAACCC CAGAGAGGGC TGXCTTGAC TCATCTCAGG 1920
GAATGTA3CC CCTGGGCCCT GGCTTAAGCC GACACTCCTG AC2TCTCTGT TCACCCTGAG 1980
GGCTGTCTTG AAGCCCGCTA CCCACTCTGA GGCTCCTAGG AGGTACCATG CTTCCCACTC 2040
TGGGGCCTGC CCCTGCCTAG CAGTCTCCCA GCTCCCAACA GCCTGGGGAA GCTCTGCACA 2100
GAGTGACCTG AGACCAGGTA CAGGAAACCT GTAGCTCAAT CAGTGTCTCT WTAACTGCAT 2160
AAGCAATAAG ATCTTAATAA AGTCTTCTAG GCTGTAGGGT GGTTCCTACA ACCACAGCCA 2220
AAAAAAAAAA AAAAAAACTC GAG 2243


(2) INFORMATION FOR SEQ ID NO: 145:


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1082 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145:






GCCAAGCTCT AATACGACTC AL I AI ACjCJOA muliujIal l^liva-aom rtuu^ji 60
GGAATTCCCG GGTCGACCCA CGCGTCCGCT TCCGTGTGTC AAAATCCTCA CCTCCTTCAT 120
AACCATCTCC CACAATTAAT TCTTGACTAT ATAAATTTAT GGTTTGATAA TATTATCAAT 180
TTGTAATCAA TTGAGATTTC TTTAGTGCTT GCTTTTCTGT GACTCAACTG CCCAGACACC 240
TCATTGTACT TGAAAACTGG AACANCTTGG GAATGCCATG GGGTTTGATA ATCTGCCAGG 300
GACATGAAGA GGCTCAGCTT CCTGGGACCA TGACTTTGGC TCAGCTGATC CTGNACATGG 360
GAGAACAACC ACATTTTTCT TTGTGTGTGC TTCTAGCAGC TGTTCGGGAG GACCKTGACC 420
CAAYAGTGTT CCCATGCTGT TTCTTGTGAA ATGCTCTCGG CTATGTAGCA GCTTTTGATT 480
CCCTGCATAC CCTAGGCTGC TGCCCCTATC CTGTCCCTTG TTTATAACAT TGAGAGGTTT 540
TCTAGGGCAC ATACTGAGTG AGAGCAGTGT TGAGAAGTCG GGGAAAATGG TGACTACTTT 600
TAGAGCAAGG CTGGGCATCA GCACCTGTCC AGCTCTACTT GTGTGATGTT TCAGGAACTC 660
AGCCCCTTTT TCTGCCTAGG ATAAGGAGCT GAAAGATTAA CTTGGATCTY CTAATGGTCC 720
AAATCTTTTG GTCACAATAA AGAGTCTCCA AATTAGAGAC TGCATGTTAG TTCTGGATGG 780
ATTTOGTGGC CTGACATGAT ACCCTGCCAG CTGTGAGGGG ACCCCGTTTT TAACATGCAT 840
GGCCAAGCTC TCTGCAAATG GAAATGCTTA CACTGGGTGT TGGGGATGTT TGCTACCTCC 900
TGCTATTTTT GTGGTTTTGG TT:TCCCACT ATGGTAGGAC CCCTGGCCAG CATTGTGGCT 960
TGTCATGTCA GCCCCATTGA CTACCTTCTC ATGCTCTGAG GTACTACTGC CTCTGCAGCA 1020
CAAATTTCTA TTTCTGTCAA TAAAAGGAGA TGAAAATAAA AAANAAAAAA AAAAAACTCG 1080
MG 1082



(2) INFORMATION FOR SEQ ID NO: 146: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4313 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146: 
CAAGCTGGTT TGAAACTAGG GGTCGGGCTC GGCCGTCGTC GTTGTTTGTC GCCGCATCCC 60
CGCTTCCGGG TTAGGCCGTT CCTGCCCGCC CCCTCCTCTC CTCCCTTCGG ACCCATAGAT 120
CTCAGGCTCG GCTCCCCGCC CGCCGCAGCC CACTGTTGAC CCGGCCCGTA CTGCGGCCCC 180
GTGGCCACCA TGTCCCTGCA CGGCAAACGG AAGGAGATCT ACAAGTATGA AGCGCCCTGG 240
ACAGTCTACG CGATGAACTG GAGTGTGCGG CCCGATAAGC GCTTTCGCTT GGCC-CTGGGC 300
AGCTTCGTGG AGGAGTACAA CAACAAGGTT CAGCTTGTTG GTTTAGATGA GGAGAGTTCA 360
GAGTTTATTT GCAGAAACAC CTTTGACCAC CCATACCCCA CCACAAAGCT CATGTGGATC 420
CCTGACACAA AAGGCGTCTA TCCAGACCTA CTGGCAACAA GCGGTGACTA TCTCCGTGTG 480
TGGAGGGTTG GTGAAACAGA GACCAGGCTG GAGTGTTTGC TAAACAATAA TAAGAACTCT 540
GATTTCTGTG CTCCCCTGAC CTCCTTTGAC TGGAATGAGG TGGATCCTTA TCTTTTAGGT 600
ACCTCAAGCA TTGATACGAC ATGCACCATC TGGGGGCTGG AGACAGGGCA GGTGTTAGGG 660
CGAGTGAATC TC3TX3TCTGG CCACGTGAAG ACCCAGCTGA TCGCCCATGA CAAAGAGGTC 720
TATGATATTG CATTTAGCCG GGCCGGGGGT GGCAGGGACA TGTTTGCCTC TGTC<jGTGCT 780
GATGGCTCGG TGCGGATGTT TGACCTCCGC CATCTAGAAC ACAGCACCAT CATTTACGAA 840
GACCCACAGC ATCACCCACT GCTTCGCCTC TGCTGGAACA AGCAGGACCC TAACTACCTG 900
GCCACCATGG CCATGGATGG AATGGAGGTG GTGATTCTAG ATGTCCGGGT TCCTGCACAC 960

CTGTSGCCAG GTTAAACAAC CATCGAC-CAT GTGTCAATGG CATTGCTTGG GCCCCACATT 1020
CATCCTGCCA CATCTGCACT GCAGCGGA7G ACCACCAGGC TCTCATCTGG GACATCCAGC 1080
AAATGCCCCG AGCCATTGAG GACCCTATCC TGGCCTACA2 AGCTGKAAGG WGAGATCAAC 1140
AATGTGCAGT GGGCATCAAC TCAGCCCGAA YTGTCGCCAT CTGCTACAAC AACTGCCTGG 1200
AGATACTCAG AGTGTAGTGT TGGTGGCGCT GTGCCCACGA GGCAGGGGCT TTTGTATTTC 1260
CTGCCTCTGC CCCACCCCCA AAGTAAGAAG AAACATGTTT CCAGTGGCCA GTATGTCTTT 1320
CATTGCTTTG CACCCACTGT TACCAGAAGC TGCTCTAGGA GTTCCTGGCC AGTCACCCCA 1380
TCGCCCTCTG TGGCAGACTC AGTGCTGTGT GGCGCCTCCT CAGCCCAGGG CTGAGTTTTA 1440
AGATTTTCTC TCCTTTCCTC TTCTCCTTTG GTTCCTCAAT TAAAAAATGT GTGTATATTT 1500
GTTTGTCAGG CGTTGTGTTG AGGAGCAGTT CACGCACTGG CTGTGTCTAT TCCTCTGCCC 1560
AGGTGTCTCT GTTTGCTGCC CAAKGYWKKT TTTCATGTCT CGTCCATGTC CATGTTCGTG 1620
TTAGCACTWA CGTGGGAACA AATACCAATT TGTCTTTTCT CCTAGTATCA GTGTGTTTAA 1680
CAAATTTTAA CTTTGTATAT TTGTTATCTA TCAGGCTAAT TTTTTTATGA AAAGAATTTT 1740
ACTCTCCTGC TTCATTTCTT TGTCTTATAG TCCTCCCTCT TTGCACCTTC TTCTCTTCCC 1800
TCAGTGCCTG GAGCTGGTAC TGGGCCCCTG GCCCCATGAG CAGTTTGCCT TCTTGAGTCA 1860
CTGCCTGTGT AGTACATACC TGACCGGGAG TCCAAACCAC CTTGGTGCTC TGAAGTCCAC 1920
TGACTCATCA CACCTTTCTT AGCCTGGCTC CTCTCAAGGG CATTCTGGGC TTGTAAACAG 1980
ACATAGGAAG CCTCTGTTTA CCCTGAAGCA CCACTGTCCA GCCCATTGGT TCCCACTGGC 2040
AGCATGGTAG AGCTGAGAGA AACAGGCTCT CAGGGTACCT GACTTGAGGG GAATCGTTTC 2100
ATGAAGCTGA ACTTCAAGCA TATTTCCAGT ACATTCTTTC AGAGTCTGTT TTTCCATCCA 2160
AATATAAGCC CCAGGCCATT CCACTTAGTG TCTTTTCAAT GATAGGCAAG AATGATATCT 2220
GAGTTGAACT TCGGTGCTTC TGTTGTTTGA GTTTACTGTG CCTGGTGGTA TATTGGGCAT 2280
TCTTTGGATT GAGTGTTCTG AGGTGAGAGA GTCTTCCCGA GGCATCCTGT CTGTGCTTCC 2340
AACCCTGAAC AAGACCTTAC ATGAGAGATG GACTGATGGA CTGCGGCAAT CCTGGGCTGT 2400
CAAGTGGATA GATAGTTAAA AAGCATTATA CTGTGGGTAA TGAAAAGGGA GGAAAAAAAA 2460
AGAAGGAAAA GGAATTATAG ACCCCCAGGG TCAGCCAGTT AAGAGCTCTA CCCACACCTG 2520
TCAACCCCTC TCTCCCCCAG TTTAGGTTCT GAGCAGTATT GGACTTGTAG CCTGCAGTTG 2580
TCTTTTGACT TGCAGGCCGC AGTGTCTTTC TGTTATGTGA ATGAGTTCCA TGGAGGGGCA 2640
TATGTGTGAT TCCACCGTTA GATGAGCCCT TGGGGCAGGC AGTTTGGGAT GTGCTCTTGG 2700
GGGAAAGTTG GCTGTTTCCT TGCGCTCTGC TCCTACCCGA AGTTTTTAAG TCCCTCTGAA 2760



TTGCTCATC? GAGATTAGTA GAGTAGCAGG CCTGAAGGAT GATGGTTTTG TCCTCTTTGG 2820
TTCTCACCTG CTTGAGAAGT AAAACAGTAA CTTTGTTCTT CTGGGCCCTT AAGCTTTTTT 2880
GGTTAAGTCT TCCTTTTCAG AAGTAGATGT CATTATATGC CAAAAGTCTA GCTCTTTGCT 2940
TTACCATACA GGGACCTGTC CCAAAGAAAA AGGCTCTTTT TTTAGCCAGC ATATTTCCCC 3000
TTCTACCCTT TTACTTTGTT GTTCTGATTT TAGGACTCTG GCTGGCCATG TGCTTGTGGT 3060
TGCCTCTCCT GOATTTGCCA CTGGATTTGC ACTGCATCGT TTGGAGATAC AAAGCGAGCA 3120
GTTCTTGGTC AGAACCCTCC TCTGCTTTTC ATTGTGTTTG ATAATGGTTA CTGGGTCCTT 3180
CTCTCAAGGG TAGCAAGGCC AAGCTGATGG CTGCTTGTTT AGGAGGCCAT CAGTTCCTTC 3240
CTGTGGAGAA GGGTCTGAAA TGGAAGTCAG TGGTAGAAGG GGCTGGTCTG CTGGGCAGGG 3300
CTTACATCCA CTGAGTTCTA AGATTCCTTT CCTGATCTGC ACCTACGCCT GGTCTGTATG 3360
GTGGAATTTG TCAGCTGGAA CTCAGAAACA ACAACTTGAA AAAAAAATAA TAATTAGAAC 3420
ATATTTGCAT AAGATAGCTA TTTACTCTGG AAACCAACAA CTTTTGAGAT TTCCCTTGCC 3480
CTGTGGACGC CCAGCTCCTG TCATCCTTCC TTAGGTCCTG CAGTACAGTC TTCCCCTGAA 3540
TGCCACCGGG GACCCAGGGG GACTCCACCC CCCTAAGCAA GCACACACAT ACTCACAGTT 3600
GATGAGTTGC TGGTCTTTGA GTCCCAGCTC TCTTACCCTC CCTITACTCC ACCAGCCCGA 3660
CGACCCATGA CTGAGGAGGG GATTTCTACA GTCTCAGGAT TTAGAAAGTC TGTAAGCCAT 3720
CCATGCTCCA GAAAGCACCG ATCTGTTGTA GTTGCAAAAA CAACTCTGTA ATTTGTTGAG 3780
GTTCTCAAAC TGACAGCCAG CGAGACTGGG TGGGAGGCCC TGGATCTGTT CTCCCTGACT 3840
GCGGGAGGAG CAGCCACTAG GACTTTAGCA GGAAGCCCAC ATGGAGGCTC CGCCAGGCTG 3900
TGGCCCAGCT GGTGATGGCC CTTTTGCTCC TGGCAGCCTG AGGCACAGCT GCCTGTATTG 3960
TCCTCATCTG TTCTGACTGA AGGATGGAGG TGCTGAATAA ATTAGGCCTC AGGCNTCTAC 4020
CACCAGAGAG CTGGAGAATG GGTCCACGTC ATTCAAGGAC CTGAATTTTT TATGCTCAGG 4080
AGCATTGGAA TCCTCTTCTT CCAGGGAGGA ATTAGCCTGC AAGGTTAGGA CTTGAAGAGG 4140
GAAGGTATTT AATAACTGGG CGAGGATGGG TGTGGTGGCT CACACCTGTA ATCCCAGCAT 4200
TTTGGGAGGC TGAGGTGGCC AGATCCCAAG GTCAGAAGAT CGAGACCATC CTGGCTAACA 4260
TGGTGAAACC CCATCTCTAC TAAAAATACA AAATTAAATT GGCCGGGCGT GAA 4313

(2) INFORMATION FOR SEQ ID NO: 147:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1183 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147:

GGCAGAGCCT CAAGCTGACT TGGATTATGT GGTCCCTCAA ATCTACCGAC ACATGCAGGA 60
GGAGTTCCGG GGCCGGTTAG AGAGGACCAA ATCTCAGGGT CCCCTGACTG TGGCTGCTTA 120
TCAXWYGGGG AGTGTCTACT CAGCTGCTAT GGTCACAGCC CTCACCCTGT TGGCCTTCCC 180
ACTTCTGCTG TTGCATGCGG AGCGCATCAG CCTTGTGTTC CTGCTTCTGT TTCTGCAGAG 240
CrrCCTTCTC CTACATCTGC TTGCTGCTGG GATACCCGTC ACCACCCCTG GTCCTTTTAC 300
TGTGCCATGG CAGGCAGTCT CGGCTTGGGC CCTCATGGCC ACACAGACCT TCTACTCCAC 360
AGGCCACCAG CCTGTCTTTC CAGCCATCCA TTGGCATGCA GCCTTCGTGG GATTCCCAGA 420
GGGTCATGGC TCCTGTACTT GGCTGCCTGC TTTGCTAGTG GGAGC~AACA CCTTTGCCTC 480
CCACCTCCTC TTTGCAGTAG GTTGCCCACT GCTCCTGCTC TGGCCTTTCC TGTGTGAGAG 540
TCAAGGGCTG CGGAAGAGAC AGCAGCCCCC AGGGAATGAA GCTGATGCCA GAGTCAGACC 600
CGAGGAGGAA GAGGAGCCAC TGATGGAGAT GCGGCTCCGG GATGCGCCTC AGCACTTCTA 660
TGCAGCACTG CTGCAGCTGG GCCTCAAGTA CCTCTTTATC CTTGGTATTC AGATTCTGGC 720
CTGTGCCTTG GCAGCCTCCA TCCTTCGCAG GCATCTCATG GTCTGGAAAG TGTTTGCCCC 780
TAAGTTCATA TTTGAGGCTG TGGGCTTCAT TGTGAGCAGC GTGGGACTTC TCCTGGGCAT 840
AGCTTTGGTG ATGAGAGTGG ATGGTGCTGT GAGCTCCTGG TTCAGGCAGC TATTTCTGGC 900
CCAGCAGAGG TAGCCTAGTC TGTGATTACT GGCACTTGGC TACAGAGAGT GCTGGAGAAC 960
AGTGTAGCCT GGCCTGTACA GGTACTGGAT GATCTGCAAG ACAGGCTCAG CCATACTCTT 1020
ACTATCATGC AGCCAGGGGC CGCTGACATC TANGACTTCA TTATTCWATR ATTCAGGACC 1080
ACAGTGGAGT ATGATCCCTA ACTCCTGATT TGGATGCATC TGAGGGACAA GGGGGKCGGT 1140
STCCGAAGTG GAATAAAATA GGCGGGCGTG GTGACTTGCA CCT 1183



(2) INFORMATION FOR SEQ ID NO: 148: 

50 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 734 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148:



GAATTCGGCA GAGTGAAGCA TTAGAATGAT TCCAACACTG CTCTTCTGCA CCATGAGACC 60



AACC CA3GGC 


AAGATCCCAT 


CCCATCACAT 


CAGCCTACCT 


CCJTCCTGGC 


TGCTGGCCAK 


12C 


GATGTCGCCA 


GCATTACCTT 


CCACTGCCTT 


TCTCCCTGGG 


AA 3CAGCACA 




130 


GGCACCAGGC 


CACCTCTGTT 


GGGACCCACA GGAAAGAGTG 




T>3CKTGGCTG 


240 


ACCTTTCTAT 


CTTCTCTAGG 


C7CAGGTACT 


GCTCCTCCAT 






300 


GAGAAGAAGC 


TCTCATAC3C 


CTTCCZACTC 


CC7CTGGTTT 


1 r.o^irtL ^ . 


AwTCCCTAGC 


360 


C AAC AGGAG A 


GGAGGCCTCC 


TGGGGTTTCC 


CCRRGGCAGT 


AGGTCAAACG 


ACCTCATCAC 


42 0 


AGTCTTCCTT 


CCTCTTCAAG 


CGTTTCATGT 


TGAACACA'GC 


TCTCTCCRCT 


CCCTTGTGAT 


480 


TTCTGAGGGT 


CACCACTGCC 


ARCCTCAGGC 


AACATAGAGA 


GC CTCCTGTT 


CTTTCTATGC 


540 


TTGGTCTGAC 


TGAGCCTAAA 


GTTGAGAAAA 


TGGGTGCCAA 


GGCC^GTGCC 


AGTGTCTTGG 


600 


GGCCCCTTTG GCTCTCCCTC ACTCTCTGAG GCTCCAGCTG 


GTCCTGGGAC 


ATGCAGCCAG 


660 


GACTGTGAGT 


CTGGGCASGT 


CCAAGGCCTG 


CACCTTCAAG AAGTGGAATA AATGTGGCCT 


720 


TTGCTTCTAT 


TTAA 










734 



25 



(2) INFORMATION FOR SEQ ID NO: 149: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1405 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149:





GGCACAGTGG ACCCCAGACT CCCTCTCCGC CTTTCTCTGC CTGGGGAGAC CCACTGTGTG 60
CATGGCATCA CTGACTCCCA TACCTCTGGC TATCAAAGGT TTCTGCCATG GCCACCCTGG 120
AAGSAAACCA GAGGGAGGTA GACAGGGAGA TCAGGTCCCT TCTACTCTGG TTCCTGCTCT 180
GTGAAATTGT CTCAGGCTGG CTGTGTCCAG ARGGTCCCTG GTTCTCTCAR GGATGCCAAA 240
TCTACAAGAA TCTCTCCTCT TCCAGTTCCT ATAACCTCTC CTTCCTTTTG TCTCTTTAGA 300
CCTTGGAGTA GTAGCAGCCA GGTTCTTTCT ATCTCTGGGT TAGTGCATTA TCTCTGGTGG 360
CTCCCTTACC CAGGACTTTG GGAATGGTCT TTTTGTAATA CATTCTCCTC AAATAATTCA 420
ATTTTGAGTG TTCTGTATGT ATCCTGCTGG GAGGTTGTTA TATACAAATC ACTGTGCCCG 480
TTTAGCAGAG AAGGAGACTG AAGCTCAGGG AGGTTAAGTG TVTT^X-TCTA GGTCGTATTG 540
TGGAGAAAGT GGCTGACTGG GGACTTGAAT GAGGTCCCTA GTTTCATGCT CGGAGGGCAA 600
AGANGAATGT CO^TTGGCC TGAGATAAGC CTCTGGTAAA ATGTACTGTA CATAATAGGT 660
AATCAATAAA TGTTGGCTGA TGACAAACAT GTTTTCTTTG TTCATTAGTT ATAGTGATTA 720
TGTTCTAAAT AACTCCMACA AGGAARTCAG CACATTTGGA ATATCA'/JTAT CTTTCCATC-A 780
TAATATCTTT CCMYGGAAAG AWAATGATAT TCCMAACTGG GAGTGTCCCW AGCARATCTG 840
AMTCTGTGTA TTGGCCCTGG GGTGG3CCAG CCCCTTAGAC TCTATGGTCT CATTCTCTTT 900
GTTTACAAAA TTGAGATAAG GCCTTATTCT CTCCCCACCC CACCCATCCA TATTGTTTTG 960
AGAATAAAAT GAGAGGATGT GTGTCAAGGG TGTATTTTGG CAATAGTCTC TGAGCCATTT 1020
TCTGAGCACC TCCATACTGT TGACACTCAA GTAATATTTC ATCAGCATTC CATTCAGGCT 1080
CCTCCCTTAA TGAGGTGTGC GATGTACAAG AGTYGTGAGG TGGCAAAGGA TGGGCTCCTG 1140
AGGAAACACT TAGGAAACTG GGCTTTCTGC CATTAAAAGA GACAAACCTT TGTGGTGACC 1200
TAATTAAAGT TTTTAAAATT CAATTTGGAA AGTTAGCAAG CTAGCTCCTK TCCAGGWAAA 1260
ATAAGGAGTC AGTGCATGAC CTAACCGCTC CCGGGCTGCT TGCCATTCCA AACAACTGCA 1320
GTAAGTTTAT CACNTTCTTT CAGGGACTGA GGTTTCCAGG CACAGACTTG GATAAGGAAG 1380
GATGTCCTAT GGGGTCACAT TGATG 1405



(2) INFORMATION FOR SEQ ID NO: 150: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2890 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150:

TTATATGCTA CAGCTACAGT AATTTCTTCT CCAAGCACAG AGGANCTTTC CCAGGATCAG 60
GGGGATCGCG CGTCACTTGA TGCTGCTGAC AGTGGTCGTG GGAGCTGGAC GTCATGCTCA 120
AGTGGCTCCC ATGATAATAT ACAGACGATC CAGCACCAGA GAAGCTGGGA a-.CTCTTCCA 180
TTCGGGCATA CTCACTTTGA TTATTCAGGG GATCCTGCAG GTTTATGGGC ATCAAGCAGC 240
CATATGGACC AAATTATGTT TTCTGATCAT AGCACAAAGT ATAACAGGCA AAATCAAAGT 300
AGAGAGAGCC TTGAACAAGC CCAGTCCCGA GCAAGCTGGG CGTCTTCCAC AGGTTACTGG 360
GGAGAAGACT CAGAAGGTGA CACAGGCACA ATAAAGCGGA GGGGTGGAAA GGATGTTTCC 420
ATTGAAGCCG AAAGCAGTAG CCTAACGTCT GTGACTACGG AAGAAACCAA GCCTGTCCCC 480
ATGCCTGCCC ACATAGCTGT GGCATCAAGT ACTACAAAGG GGCTCATTGC ACGAAAGGAG 540
GGCAGGTATC GAGAGCCCCC GCCCACCCCT CCCGGCTACA TTGGAATTCC CA7TACTGAC 600
TTTCCAGAAG GGCACTCCCA TCCAGCCAGG AAACCGCCGG ACTACAACGT GGCCCTTCAG 660
AGATCGCGGA TGGTCGCACG ATCCTCCGAC ACAGC7GGGC CTTCATCCGT ACAGOAGCCA 720
CATGGGCA7C CCACCAGCAG CAGGCCTGTG AACAAACCTC AGTGGCATAA AYCGAACGAG 780
TCTGACCCGC GCCTCGCCCC YTATCAGTCC CAAGGGTTTT CCACCGAGGA GGATGAAGA? 840
GAACAAGTTT CTGCTGTTTG AGGCACAGAC TTTTCTGGAA GCAGAGCGAG CCACCTGAAA 900
GGAGAGCAGA AGAAGACGTC CTGAGCATTG GAGCCTTGGA ACTCACATTC TGAGGACGGT 960
GGACCAGTTT GCCTCCTTCC CTGCCTTAAA AGCAGCATGG GGSTTCTTCT CCCCGTCTTC 1020
CTTTCGCCTT TGCATGTGAA ATACTGTGAA GAAATTGCCC TGGCACTTTT CAGACTTTGT 1080
TGCTTGAAAT GCACA;3TGCA GCAATCTTCG AGCTCCCACT GTTGCTGCCT GCCACATCAC 1140
ACAGTATCAT TCCAAATTCC AAGATCATCA CAACAAGATG ATTGACTCTG GCTGCACTTC 1200
TCAATGCCTG GAAGGATTTT TTTTAATCTT CCTTTTAGAT TTCAATCCAG TCCTAGCACT 1260
TGATCTCATT GGGATAATGA GAAAAGCTAG CCATTGAACT ACTTGGGGCC TTTAACCCAC 1320
CAAGGAAGAC AAAGAAAAAC AATGAAATCC TTTGAGTACA GTGCTTGTCC ACTTGTTTAC 1380
AATGTCCTCC TTTTAAAAAA AAAAAAATGA GTTTAAAGAT TTTGTTCAGA GAGTAAATAT 1440
ATATCCATTT AATGATTACA GTATTATTTT AAACCTTAAG TAGGGTTGCG AGCCTGGTTT 1500
CTGAAAAACC AAATATGCCG GACAGGGTGT GGCCACACCA AGAAGACGGG AAGACCTGGC 1560
TTGTGACCCT GGCTTCCCAT GTCCTTCTGG TCTGACCCGC GAAGTGCCCT ATCCTGGAAG 1620
TATGAAATGT TAGCCAATTA ATACCAAGAC ACCTCATCTG CTCCTTCCCC AGTGGATGGG 1680
GTTCTTCTGT AAAACTGTTT GCACATGGCC AGGGGAGGGA ACTAGGACCC TTGTGTCCTG 1740
TCTGAGCCTT ATGGAGGCAG GACGGTGTCA TTGGCGGATG TGTCCTGCTC CATTGAGATG 1800
GATGGCAAAC CCCATTTTTA AGTTATATTT CTTTGATTTT TGTTAATTTA GAGGTGTAGG 1860
TTTTGTTTTT TGTTTTTTTG TTTTTTTTTA AGAGAAACAT TTATAACTGG ATAGGATTGC 1920
AGTGAAAGCA GCTTGGGATG TTGGAGCTAA TGCCAGCTGT TTATACTGCT CTTTCAAGAC 1980
AGCCTCCCTT TATTGAATTG GCATTAGGGA ATAAACAAGC CTTTAAACGT GATAAAAGAT 2040
CAAAAACCTG GTTAGACATG CCAGCCTTTG CAAGGCAGGT TAGTCACCAA AGACTAACCT 2100
CCAAGTGGCT TTATGGACGC TGCATATAGA GAAGGCCTAA GTGTAGCAAC CATCTGCTCA 2160
CAGCTGCTAT TAACCCTATA ATGACTGAAA TGACCCCTCC ACTCTATTTT TGTGTTGTTT 2220
TGCACAGACT CCGGAAAAGT GAAGGCTGCC AATCTGAGTA GTACTCAAAT GTGAGGAACT 2280
GCTGGTCTTG GATTTTTTTT CCATTAAATT CAGCTGATCA TATTGATCAG TAGATAAACG 2340
TAAATAGCTT CAAATTTTAA AAGTGGAATT GCAGTGTTTT TTCACTGTAT GAAACAATGT 2400
CAGTGCTTTA TTTAATAATT CTCTTCTGTA TCATGGCATT TGTCTACTTG CTTATTACAT 2460



TGTCAATTAT GCATTTGTAA TTTTACATGT AATATGCATT ATTTGCCAGT TTTATTATAT 2520
AGGCTATGGA CCTCATGTGC ATATAGAAAG ACAGAAATCT AGCTCTACCA CAAGTTGCAC 2580
AAATGTTATC TAAGCATTAA GTAATTGTAG AACATAGGAC TGCTAATCTC AGTTCGCTCT 2640
GTGATGTCAA GTGCAGAATG TAGAATTAAC TGGTGATTTC CTCATACTTT TGATACTACT 2700
TGTACCTGTA TGTCTTTTAG AAAGAGATTG GTGGAGTCTG TATCCCTTTT GTATTTTTAA 2760
TACAATAATT GTACATATTG GTTATATTTT TGTTGAAGAP GGTAGAAATG TACTATGTTT 2820
ATGCTTCTAC ATCCAGTTTG TACAAGCTGG AAAATAAATA AATATAACAT AAAAAAAAAA 2880
AAAAAAAAAA 2890


(2) INFORMATION FOR SEQ ID NO: 151:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2399 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151:








GAACTTTTCC ATCTGGCAAA CCGGAAACTC CATCCCCATT AAACCAACTC CCCCTTTTGG 60
TTTCCCCCCC AGNGGAATAG AATTTGGACN CCCATATAAA TCCAGGAAAC CACCTAAATT 120
CTTTAGTNGT TTGTGTTTGC AAGATCTAAG GTCATGGTAA ACATTAAGTT CTTAAAATTT 180
TTGGGAGGGA CCAGTGCACC TCTCCCTCTG AATTGTTCNC CAATTTAAAA TTGGAGTAAG 240
GTTTTAAAAT GTCTNATTCC ATTGGAAGGG TNTGTTATTT CATTTTGAGC CCAGAGGGGA 300
GAGGCACATT TTAAATATCA GAATTAGATT AGCTTTGAGT TTGTACAATT GGGAACATAA 360
TAGATTTTCA TAAATTATGT GTGCCTTGTT GGAAGTGTCA ACTGTCTTTA TGTCTGCTTG 420
TAAAAGTTTC AAAATATGTT TTCCCTCAAA AAGGCAACGT TACTTCATTT GCTTGAATAT 480
TATGATAGGA ATGCTTACTG ATATTACTTG ATAGTCATAT ATAGCCTAGG AAATTTAACA 540
TATATATAAC TATAGCAGTA TTAATAATGA TAGTTGTACT TCTTTAAAAC ATTAAATTTG 600
AGGAAACTTT AATGCTGTCT CGTGTACATT GCTTTACTAC AGTGAGGGGG AATATCCTTT 660
AGATTGAGCC TCAATTTACT GGTTAGTAGT ATGTGAACTC TGGTATAAAA ACGTAAACTA 720
GACAGTAGAG CCGATGAATT AAAATTGTAA ATTGCTACAT TGGCATTTTC TACCTCCTTT 780
TCTGTCAGAG TATTACTTTT TCCAGCATTT ATTCTTATTT GTGAGTAAAG AGGAAATGGG 840
AACCTGAGGT TAAAATTGAC ATTTTTGTTT CATTGAGAAT TTAAGCAGTA GGTACAGGAG 900
AAGTGACTTG TCACATTAAT TTGGTGCCTA AATCTGTAAC TACAAGTTGT GATCGACATG 960



TACAAAATG? CTAAGAAAGG TCATATGCTG AATATTTTAC TTTTCCTGTA TAGTCTGCAT 1020
GATTTGTTTC ATAAACCCAG CTTATTTCCT CCAAAAAjCA AAATGGTCCT GTAATTTTTA 1080
AAGTAAAATA AACGTGCCAT TTTGTCTGCA ATCTATAATT TCAGGAAGTT ATTGRAAGTT 1140
CTGACTCAGG GCTTTTTAAC AGTTCAAGCA ATTGTCA3TT ATATTTTGGA AACTCCATCT 1200
GTGTAATTCT CCAGTGCCTT GAAAGAATTA TTAACTTGGC AACACTATTA AAACTTTATA 1260
AAAGATGGTC TTTAGTGCAC GTGTATCATT ATATACA3GT TTTAAAGTCA TATTGCTTAG 1320
CTTGTTAATA ATGATTCTGC ATGTGTGCTG GGTTTGGGTA ATTCTTTAAA GGAAGTTTTC 1380
TAGATTTGCA CTTGATGTTT GTTTTTTAAA AACTGATTAT TTATGGCCGT GACACTGTTA 1440
CCAGAAAAGT AATTCTAATT AAGTTATTAT GCAAAGTCAT CTATAAGTAG CATCTGGGAA 1500
GAGGAGATSG AGGCCACAGT TTGCTATTTT AGTATGAAAG GAGGATCTGT TTGG3AAACA 1560
TAGATTGTCT TCCCCTCAAA TGAGGGGAAA AAAAAAGACC CTTTGTTCAA ATGGATTCTG 1620
TTGTAAAAAA TTATTTTTAA AGGAAATCAC AAATTGTATG TCATTCTTAA TGCTAGTCTT 1680
ATAGAATAAA TCCATAAAAT TGTTTTTATG TTCAGTATGT TTATGTCATT CTAAATGCAG 1740
CAAATTCAAT GATAGCAGTT CAATTGACTC ATAGCAGTGT TTTGTATTTT TTCTAATTCT 1800
TTAGCTTTCA ATATTGGATT AAAGTCTTGT TTGTGAATAT AGTTTCCGTA TGGCAAATGA 1860
TTTCTTGCTT ATTAGCTTTT GTTAAAGAAT GCTTAGTAAG AGCTAAGCTT TTAAAAGTAA 1920
TGCAAACATT TATCGTTAAT AAAACCTATG GTGTAATATC ATATAATGCT TTTCTTTGAT 1980
CTTTGGAGAA TTATTCTTTT ATAGTAGTAT ACATGAATTT TGATTTTTAA AGCATTTAAA 2040
AACAAATCTC AATACATTAA AAAACCTGTT ATTGTTAAAA RGGAAATTAC CATGCCTTTA 2100
AGAAACAAGG ATGTACATCT TCAATTCAGC ATRAGTGTCC ACATCTAGAA GGCTCTCATT 2160
GCAGTTGTTT ACAGTTAAGG TACCTCTATC TAAAGGGCCA AAGAAGCATT TCATAYTTTA 2220
ACACCTCACA TTCTTTCAGG ATTAAGACAT ATGAAAATAG TCTGAATAGG ATAAATTTGG 2280
ATAGGAAGTA ACTTAACCAG TCTGGGAAGA TTCAGGCTTT TTCTATKAAA AAGCTTATTC 2340
CTCTTCACAA CTCNGGTGGT AGGTJTTTCAT TTTTCAAGAG GGTAGATATT TTAAAGCCA 2399



(2) INFORMATION FOR SEQ ID NO: 152:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 802 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152:
CGTGCCTGTA GTAAGCTCAT CCCTGCCTTT GAGATGGTGA TGCGTGCCAA GGACAATGTT 60
TACCACCTGG AGTGCTTTGC ATGTCA-3CTT TGTAATCAGA GATTNTGTGT TGGAGACAAA 120
TTTTTCCTAA AGAATAAOTT GAYCCTTTGC CARACGGACT ACGAGGAAGG TTTAATGAAA 180
GAAGGTTATG CACCCOJGGT TCGCTGATCT ATCAACATCA CCCCATTAAG AATACAAAGC 240
ACTACATTCT TTTATCTTTT TTGCTCCACA TGTACATAAG AATTGACACA GGAACCTACT 300
GAATAGCGTA GATATAGGAA GGCAGGATGG TTATATGGAA TAAAAGGCGG ACTGCATCTG 360
TATGTAGTGA AATTGCCCCA GTTCAGAGTT GAATGTTTAT TATTAAAGAA AAAAGTAATG 420
TACATATGGC TGGATTTTTT TGCTTGCTAT TCGTTTTTGT GTCACTTGGC ATGAGATGTT 480
TATTTTGGAC TATTGTATAT AATGTATTGT AATATTTGAA GCACAAATGT AATACAGTTT 540
TATTGTGTTA CCATTTGTGT TCCATTTGCT YCTTTGTATT GTTGCATTTA GTACAATCAG 600
TGTTTAAACT TACTGTATAT TTATGCTTTC TGTATTTACC AGCTATTTTA AATGAGCTGT 660
AACTTTCTAG TAAAGAATTG AAAAGCAAAT CCTCACTAAA GGATACACAG GATAGGATAA 720
AGCCAAGTCN CATCAACATT AAAAAATACT AAAANANAAA ACACAAAAAA AAAAAANCCC 780
GGGGGGGGCC CGGAACCCAT TC 802

(2) INFORMATION FOR SEQ ID NO: 153:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 461 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153: 

CTAGGAGCAC CGAGCAGCTT GGCTAAAAGT AAGGGTGTCG TGCTGATGGC CCTGTGCGCA 60
CTGACCCGCG CTCTGCNCTC TCTGAACCTG GCGCCCCCGA CCGTCGCCGC CCCTGCCCCG 120
AGTCTGTTCC CCGCCGCCCA GATGATGAAC AATGGCCTCC TCCAACAGCC CTCTGCCTTG 180
ATGTTGCTCC CCTGCCGCCC AGTTCTTACT TCTGTGGCCC TTAATGCCAA CTTTGTGTCC 240
TGGAAGAGTC GTACCAAGTA CACCATTACA CCAGTGAAGA TGAGGAAGTC TGGGGGCCGA 300
GACCACACAG GTGGGAACAA GGACAGGGGG ATTTAAGCAG TCAAAAGGAA AAACATGTTA 360
AGACCCTAGA CTTGTATATT GACACACTTG TACCTTGTAA GGCAGAGGAA TGTAATTAAA 420
AAGCACTTAT TTGGCWNAAA AAAAAAAAAA AAAAAAAAAA C 461



(2) INFORMATION FOR SEQ ID NO: 154:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2388 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154:








GCCCACGCGT CCGAAAGCGG AGAACGCTGG TGGGCCTGTT GTGGAGTACG CTTTGGACTG 60
AGAAGCATCG AGGCTATAGG ACGCAGCTGT TGCCATGACG GCCCAGGGGG GCTGGTGGCT 120
AACCGAGGCC GGCGCTTCAA GTGGGCCATT GAGCTAAGCG GGCCTGGAGG AGGCAGCAGG 180
GGTCGAAGTG ACCGGGGCAG TGGCCAGGGA GACTCGCTCT ACCCAGTCGG TTACTTGGAC 240
AAGCAAGTGC CTGATACCAG CGTGCAAGAG ACAGACCGGA TCCTGGTGGA GAAGCGCTGC 300
TGGGACATCG CCTTGGGTCC CCTCAAACAG ATTCCCATGA ATCTCTTCAT CATGTACATG 360
GCAGGCAATA CTATCTCCAT CTTCCCTACT ATGATGGTGT GTATGATGGC CTGGCGACCC 420
ATTCAGGCAC TTATGGCCAT TTCAGCCACT TTCAAGATGT TAGAAAGTTC AAGCCAGAAG 480
TTTCTTCAGG GTTTGGTCTA TCTCATTGGG AACCTGATGG GTTTGGCATT GGCTGTTTAC 540
AAGTGCCAGT CCATGGGACT GTTACCTACA CATGCATCGG ATTGGTTAGC CTTCATTGAG 600
CCCCCTGAGA GAATGGAGTT CAGTGGTGGA GGACTGCTTT TGTGAACATG AGAAAGCAGC 660
GCCTGGTCCC TATGTATTTG GGTCTTATTT ACATCCTTCT TTAAGCCCAG TGGCTCCTCA 720
GCATACTCTT AAACTAATCA CTTATGTTAA AAAGAACCAA AAGACTCTTT TCTCCATGGT 780
GGGGTGACAG GTCCTAGAAG GACAATGTGC ATATTACGAC AAACACAAAG AAACTATACC 840
ATAACCCAAG GCTGAAAATA ATGTAGAAAA CTTTATTTTT GTTTCCAGTA CAGAGCAAAA 900
CAACAACAAA AAAACATAAC TATGTAAACA AGAGAATAAC TGCTGCTAAA TCAAGAACTG 960
TTGCAGCATC TCCTTTCAAT AAATTAAATG GTTGAGAACA ATGCATAAAA AAAGTTGCAC 1020
AAGTTCCTTA TTTTCCTTAA TATTTCACTT CTATTTAATA CAAGCTGGGA CATAAAAATT 1080
CTGTTGGGGA TACCTGGGGG AAGATGTGAG AAACTAATGC TGAATTCAGC TTATACATGA 1140
TGAAAAGAAA AACCAGACAA AAGGAGCACA TAAATATGCA TACAGTGTAA CTGTTATTAT 1200
TTTAATACCC ACGATAAGGG AITL^TGTTA GCATGTTTAG GGGGAACGAG GATTGGTGGG 1260
ATCCTTGGGG CCACAGGAAT CTGAGGCAAC GGAAGATATA TAGACOGATC GTCCCCCTGC 1320
CGAAGGAACC TGGCAYCTGT CAAGCAGATG CTGCAGTTCA AACTTCAGCT TTTAAGATAG 1380
ATAGCTATTG AAGGCAGAGG GTCAGCAGGA GGATGTGTAT TTCTAATCTA CCCTGGTAAA 1440



GTCATAGGTA AGACTCAAAA GCGGGATCTT ATTCAAAAGG CAGGTA7TTC CTT7GTTTTC 1500
TGTCTTGAAA TAGCCCCTTC CCCTAAGGTG CATTCTCTCA AGTTTTCAGT ATTGCTTTAT 1560
TTGCAGTGAT TAAAAGAGAT GAGAGACTTT GGAGACAGAC AACGTAAGCA ACACATACAC 1620
ACATGAAATA CTCTAGACAG AGATGAATAT AAATCTGGCC TAATAACCAG TTTTCCATGT 1680
AACAGTGATT TTGTGTTTCG GGCTGAAGCA GTGGTTATAT TAAAAGCCAC TAATTCCCTT 1740
ATCCCTTTAA AAGATTTTTA CAATTCTCCA ACCACAAACA GCACTTCTAA AACTAACTTT 1800
ACTTTCTGCC CATAATTTGT TCTACATGGA AAAAAAAAAT ATTACTTTGG CCAGGGGTGT 1860
GTGTAAATGT GGCAGAATTC CTAGGCAGGC TGACCTTTAC AGTATGGGCC TTTAAGATAC 1920
TGGATCCTGG TTGGGCAACA AGTGTCACGC CTGAAGTTTC TGAAAACAAA TTAGAAGACT 1980
GTTGGCTTGG CTAATCTCGT AGTTCAGGGC CAAGTTTCTG TAGTCAGAAT GAAGAATAAA 2040
ATTGAAAGAA AAAGGGGGAA ATGCTTATAC TTGGCATTAA GTTGAATGCC TCAAGTCTTA 2100
ACTATGGCTT TGTAGATGAG GCAAAAGATT TCTTAGTGGT AAAATTTCTT CAACAGGTCA 2160
ATGCCAATCT GTATGCCATT TTAGTAAAGT AGGTAAGGAG AGTAGCCGCT CAGTAACTTT 2220
GGCACTAAAG AAAGAGTGTG GCTCTAGAAC TTCCAATCCC ATTGCTAGAT GTGCCCTTTA 2280
AAAGATGGTC CAGTGCTTTC AGGGAAGGAT GTTTAGCCAG TTTTCCTAGT ATTTGTTCCT 2340
TAAGATTTTT TGACCTGTGC TTAATAAGAC GGACGCGTGG GTCGACCC 2388



(2) INFORMATION FOR SEQ ID NO: 155: 



(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 642 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155: 

AAAACAGACC ATTTAAAAAC TCAGACAAGA TTATATTTAA TATATTAATT ACTAAAAAGG 60
CACAAGATTA CACTGAACAT ATTAGCTACT AAAAAGGCAC TGCTAAGACA TTCAAGCAAA 120
TAGCTATTAC ACACTACTGC AGATTTTACA GGTTTCTAAT TCTAACATAT GTTTGAAAAA 180
TCCGTGAGTA TTCCAAAATA TATTTAATAA TGGAATATCT GCATTAATAT ACCATCCATG 240
T3TTTTTACC ATTTGCCTTA ATATTGAATA TACTGTTTAC CTCACACTAA AAAGAAAACC 300
AGAAGCCTTA TTTGTGATTT TGGGAGTGGA AGCTTCCATT TTTGTGTCAA AAATGAATCC 360
TGATTCTTAT GGAAATCTCT GTTATTAAGA TATTTCAAGA TGAGACAACA CTGAAGATCA 420
AATTGTGTTT AGTATCACTA TCrTCTCTCC TCCCTTCTCT CTTACTCCTC ATCCTCCCAG 480
AATCTACCAG TTTATGGTAG AAAGATGGGA ACCTTATTTG AATGTGTTTT TTTTTTTCCA 540
TGATGTCCAA TTTTGTTGTG GGAAAGGATT TGGATAAAAT TTTTGTTTAA ATTTTGGTAG 600
ATTTTTATCT ATACAAATTT AAATAAAATT ATGTTTTGTA AG 642



10 

(2) INFORMATION FOR SEQ ID NO: 156: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1251 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156: 

20 

GCCGCTGCCC CTCCACGGAG TTGCTGATCA TCTGGGCTGT GATCCACAAA CCCGGTTCTT 60 

TGTCCCTCCT AATATCAAAC AGTGGATTGC CTTGCTGCAG AGGGGAAACT GCACGTTTAA 120 

25 AGAGAAAATA TCACGGGCCG CTTTC C AC AA TGCAGTTGCT GTAGTCATCT ACAATAATAA 18 0 

ATCCAAAGAG GAGCCAGTTA CCATGACTCA TCCAGGCACT GAGCATATTA TTGCTGTCAT 240 

GATAACAGAA TTGAGGGGTA AGGATATTTT GAGTTATCTG GAGAAAAACA TCTCTGTACA 300 

30 

AATGACAATA GCTGTTGGAA CTCGAATGCC ACCGAAGAAC TTCAGCCGTG GCTCTCTAGT 360 

CTTCGTGTCA ATATCCTTTA TTGTTTTGAT GATTATTTCT TCAGCATGGC TCATATTCTA 42 0 

35 CTTCATTCAG AAGATCAGGT ACACAAATGC ACGCGACAGG AACCAGCGTC GTCTCGGAGA 48 0 

TGCAGCCAAG AAAGC CATC A GTAAATTGAC AACCAGGACA GTAAAGAAGG GTGACAAGGA 540 

AACTGACCCA GACTTTGATC ATTGTGCAGT CTGCATAGAG AGCTATAAGC AGAATGATGT 600 

40 

CGTCCGAATT CTCCCCTGCA AGCATGTTTT CCACAAATCC TGCGTGGATC CCTGGCTTAG 660 

TGAACATTGT ACCTGTCCTA TGTGCAAACT TAATATATTG AAGGCCCTGG GAATTGTGCC 720 

45 GAATTTGCCA TGTACTGATA ACGTAGCATT CGATATGGAA AGGCTCACCA GAACCCAAGC 780 

TGTTAACCGA AGATC AGCCC TCGGCGACCT CGCCGGCGAC AACTCCCTTG GCCTTGAGCC 840 

ACTTCGAACT TCGGGGATCT CACCTCTTCC TCAGGATGGG GAGCTCACTC CGAGAACAGG 900 

50 

AGAAATCAAC ATTGCAGTAA CAAAAGAATG GTTTATTATT GCCAGTTTTG GCCTCCTCAG 960 

TGCCCTCACA CTCTGCTACA TGATCATCAG AGCCACAGCT AGCTTGAATG CTAATGAGGT 1020 

55 AGAATGGTTT TGAAGAAGAA AAAACCTGCT TTCTGACTGA TTTTGCCTTG AAGGAAAAAA 1080 

GAACCTATTT TTGTGCATCA TTTACCAATC ATGCCACACA AGCATTTATT TTTAGTACAT 1140 

TTTATTTTTT CATAAAATTG CTAATGCCAA AGCTTTGTAT TAAAAGAAAT AAATAATAAA 1200

60 






ATAAAAAAAA AAAAACCC CO G3GGGGGCCC GGTCCCCAAT 7GGC_CTATC 



(2) INFORMATION FOR SEQ ID NO: 157: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2127 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157:

CCGGCGGGAG AGGGAAGCTG CAGCGAGAGG CGCGGATCTC AGCGCGGGAG CAGTGCTTCT 60 

GCGGCAGGCC CCTGAGGGAG GGAGCTGTCA GCCAGGGAAA ACCGAGAACA CCATCACCAT 120 

GACAACCAGT CACCAGCCTC AGGACAGATA CAAAGCTGTC TGGCTTATCT TCTTCATGCT 180

GGGTCTGGGA ACGCTGCTCO CGTGGAATTT TTTCATGACG GCCACTCAGT ATTTCACAAA 240

CCGCCTGGAC ATGTCCCAGA ATGTGTCCTT GGTCACTGCT GAACTGAGCA AGGACGCCCA 300 

25 

GGCGTCAGCG CNC7CTGCAG CACCCTTGCC TGAGCGGAAC TCTCTCAGTG CCATCTTCAA 360 

CAATGTCATG ACCCTATGTG CCATGCTGCC CCTGCTGTTA TTCACCTACC TCAACTCCTT 420

30 CCTGCATCAG AGGATCCCCC AGTCCGTACG GATCCTGGGC AGCCTGGTGG CCATCCTGCT 480 

GGTGTTTCTG ATCACTGCCA TCCTGGTGAA GGTGCAGCTG GATGCTCTGC CCTTCTTTGT 540 

CATCACCATG ATCAAGATCG TGCTCATTAA TTCATTTGGT GCCATCCTGC AGGGCAGCCT 600 

GTTTGGTCTG GCTGGCCTTC TGCCTGCCAG CTRACACGGC CCCCATCATG AGTGGCCAGG 660 

GCCTAGCAGG CTTCTTTGCC TCCGTGGCCA TGATCTGCGC TATTGCCAGT GGCTCGGAGC 720 

40 TATCAGAAAG TGCCTTCGGC TACTTTATCA CAGCCTGTGC TGTKATCATT TTGACCATCA 780 

TCTGTTACCT GGGCCTGCCC CGCCTGGAAT TCTACCGCTA CTACCAGCAG CTCAAGCTTG 840 

AAGGACCCGG GGAGCAGGAG ACCAAGTTGG ACCTCATTAG CAAAGGAGAG GAGCCAAGAG 900 

CAGGCAAAGA GGAATCTGGA GTTTCAGTCT CCAACTCTCA GCCCACCAAT GAAAGCCACT 960 

CTATCAAAGC CATCCTGAAA AATATCTCAG TCCTGGCTTT CTCTGTCTGC TTCATCTTCA 1020 

CTATCACCAT TGGGATGTTT CCAGCCGTGA CTGTTGAGGT CAAGTCCAGC ATCGCAGGCA 1080

GCAGCACCTG GGAACGTTAC TTCATTCCTG TGTCCTGTTT CTTGACTTTC AATATCTTTG 1140 

ACTGGTTGGG CCGGAGCCTC ACAGCTGTAT TCATGTGGCC TGGGAAGGAC AGCCGCTGGC 1200 

TGCCAAGCTG GNTGCTGGCC CGGCTGGTGT TTGTGCCACT GCTGCTGCTG TGCAACATTA 1260 

AGCCCCGCCG CTACCTGACT GTGGTCTTCG AGCACGATGC CTGGTTCATC TTCTTCATGG 1320 

CTGCCTTTGC CTTCTCCAAC GGCTACCTCG CCAGCCTCTG CATGTGCTTC GGGCCCAAGA 1380

AAGTGAAGCC A3CTGAGGCA GAGACCGCAG AGC2ATCATG GCCTTCTTCC TGTGTCTGGG 1440

TCTGGCACTG GGGGCTGTTT TCTCCTTCCT C7TTCCGGGCA ATTGTGTGAC AAAGGATGGA 1500

5 

CAGAAGGACT GCCTGCCTGG CTCCCTGTCT GCrTCCTGCC CCTTCCTTCT GCCAGGGGTG 1560 

ATCCTGAGTG GTCTGGCGGT TTTTTCTTCT AACTGACTTC TGCTTTCCAC SGCGrTGTGCT 1620

10 GGGGCCGGAT CTCCAGGCCG TGGGGAGGGA GCCTCTGGAC GGACAGTGGG GACATTGTGG 1680 

GTTTGGGGCT CAGAGTCGAG GGACGGGGTG TAGCCTCGGC ATTTGCTTGA GTTTCTCCAC 1740 

TCTTGGCTCT GACTGATCCC TGCTTGTGCA GGCGAGTGGA GGCTCTTGG3 CTTGGAGAAC 1800

15 

ACGTGTGTCT GTGTGTATGT GTCTGTGTGT CTGCGTCCGT GTCTGTCAGA CTGTCTGCCT 1860 

GTCCTGGGGT GGCTAGGAGC TGGGTCTGAC CGTTGTATGG TTTGACCTGA TATACTCCAT 1920 

TCTCCCCTGC GCCTCCTCCT CTGTGTTGTC TCCATGTCCC CCTCCCAACT CCCCATGCCC 1980

AGTTCTTACC CATCATGCAC CCTGTACAGT TGCCACGTTA CTGCCTTTTT TAAAAATATA 2040 

TTTGACAGAA ACCAGGTGCC TTCAGAGGCT CTCTGATTTA AATAAACCTT TCTTGTTTTT 2100 

TTCTCCATGG AAAAAAAAAA AAAAAAA 2127 






(2) INFORMATION FOR SEQ ID NO: 158: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1625 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158:

CAAAAGATCT ATAATCAGGA CATTGTTTAT GTAAGTTGGA CAANAAAAAT TCTTCCCCTT 60 

TATGTCCACC CTTCCTATGA TTGCAAGACA AAATTTCCCT CCTTTACCTC ATCCCTATAA 120 

45 CATGGGAGGC TGAGAAAAAT GAGGGGAGAT GGAACCAGAT ACAAGGAGAT CCAATAAGAG 180 

AAGCTTATTT AAATATTGTG AAATAAAGGA AGAMCCAAAG CATTTTTTTA AGTGGGGAAT 240

CCTTTTGAAC AGTTATTATT TATCCATATT ATTAAYAACA TCTTTTCTGA CAAAATCCAT 300 

50 

CAGATGAAGT GTAAATGGAT AATCTTTTAA TGGATCTAAA CCTAGAAAGT TTCACTTACT 360

GTTCATGTCC GTGTTCCAGA ATTGTGAAAT GGTGTGTGGT TTTCOTTTCC AAGTTCTTCT 420

55 CTGCCTCCTC TTAATTCTCT AATTCCATGT CTTACAGAAG AATGAGAAAT TTCTTTCTTA 480 

CTTGAGTATC ATGCTCTAAA AAACTTGGCT TCAGTCACAG AAACGCTGGC TCTCCTGTGC 540 

TTATATTGAA GCCAACTGCC TTTAATTCTT GGGCCCTCTT ATATTTTTAA GGTGCAAAAT 600 

60 




TTGAAGTCTC AGTCACCAGA CACA3GTTCT ATACAATTAA TGATGA3CTG GAGAA3TAAT 660 

ATGTAGCTAA TTTTTCAAAA GCATTGAATA TACTTTCCGG AAAGAAAACA GAAATTAAAT 720

ATTGCCACAT CTTGCCAGAA TCCCATCTGA CACCTTAACT TTGTCAGGTT TCTTACAACT 780

TGCTAATCAA GTTTTATACA TTCTAAATCT CCCCAGTTT 1 : TTTG3GGCTG GAAGATGCAA 840

CTTCCATTTA ATAGAAACTT TGAAATCTTG GGGTAAGGGA GCAGTGGGGG GACTAGGGAG 900 

10 

AAGGATAAGA AATAGAATTA TTGAAAAGCC CCCACCAGGG ACCTTCCTGG CCAGAATATG 960

CAGAGTAATT CCTGCTGGCT TCACCTTTGA AAGTCCCTCG AAACTATGCA GATGAAACTG 1020 

15 AGTCTGTTTT TGATATTGTC AGATGTATTC TACCTTGGAA GTCCCNACAC CTAAACTGGA 1080 

ATTCTTGTAT TTACATCTCC TCCACTGTCC CCCACACCAC CCCTCAATTC CTGCTGCCCC 1140 

TGCTAATGTT AAGCATTTTT CTCTTGTTAT CATCAGGTTC ACATTAAAAM CAGRTACTTA 1200 

20 

CAAACTGACT TGAAGCACAG ATACTTTTAC GAATGTGATA AAATATTTTC TTAAGAAAAG 1260

GAAAGAGGAT GTGGGTCAAA TAAAACACCG CATGGATGTT GATTGGTGAA TACTGGTGTA 1320 

25 AGAAAAGGGA GCTCAGGAAT TTTTATTACT GTATTTGTAA ATGAGTTTGA AGGAATTTGT 1380 

AAATGCCACT GGTACATTTT TAAGGTGACA CATTTGCTCC TTATAAAGTT ATTAAAAATT 1440 

ACAGGGTAAG CTTAAATGAC GTTTGCCAGT AGTTTTACTT TATATAATCA ATATTGATAT 1500 

30 

TGTTGCTGAA CTATGTAACT TTATGATGCA TTTTTCAGTC CCTTTTCAGA GCAAATGCTT 1560

TTGCAATGGT AGTAATGTTT AGTTTAAATT GACTTAATAA ATTTTTTACCT GAGCAAAAAA 1620

AAAAA 1625

(2) INFORMATION FOR SEQ ID NO: 159:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1687 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159:

50 CGGGGTCACC AGTTATTAGA GGAAGTAACA CAAGGGGATA TGAGTGCAGC AGACACATTT 60 

CTGTCCGATC TGCCAAGGGA TGATATCTAT GTGTCAGATG TTGAGGACGA CGGTGATGAC 120 

ACATCTCTGG ATAGTGACCT GGATCCAGAG GAGCTGGCAG GAGTCAGGGG ACATCAGGGT 180

55 

CTAAGGGACC AAAAGCGTAT GCGACTTACT GAAGTGCAAG ATGATAAAGA GGAGGAGGAG 240 

GAGGAGAATC CACTGCTGGT ACCACTGGAG GAAAAGGCAG TACTGCAGGA AGAACAAGCC 300 

AACCTGTGGT TCTCAAAGGG CAGCTTTGCT GGGNATCGAG GACGATGCCG ATGAAGGCCC 360

TGGAGATCAG TCAGGCCCAG CTGTTATTTG AGAACCGGYG GAAGGGACGG CA'SCAGCAGC 420

AGAAGCAGCA GCTGCCACAG ACACCCCCTT CCTGTTTGAA GACTGAGATA ATGTCTCCCC 480

TGTACCAAGA TGAAGCCCCT AAGGNAACAG AGGCTTCTTC GGGGACAGAA GCTGCCACTG 540

GCrTTGAAGG GGAAGAAAAG GATGGCATCT CAGACAGTGA TAGCAGTACT AC-CAKTGAGG 600

AAGAAGAGAG CTGGGAACCC TCC3TGGTAA GAAGCGAASC GTGGGCCTA^ AGTCA3ATGA 660

TGACGGGTTT GAGATAGTGC CTATTGAGGA CCCAGCGAAA CATCGGATAC TG3ACCCCGA 720

AGGCCTTGCT CTAGGTGCTG TTATTGCCTC TTCCAAAAAG GCCAAGAGAG A Z CTCAT AGA 780

TAACTCCTTC AACCGGTACA CATTTAATGA GGATGAGGGG GAGCTTCCGG AGTGGTTTGT 840

GCAAGAGGAA AAGCAGCACC GGATACGACA GTTGCCTGTT GGTAAGAAGG A'GGTGGAGCA 900

TTACCGGAAA CGCTGGCGGG AAATCAATGC ACGTCCCATC AAGAAGGTGG CTGAGGCTAA 960

GGCTAGAAAG AAAAGGAGGA TGCTGAAGAG GCTGGAGCAG ACCAGGAAGA AGGCAGAAGC 1020 

CGTGGTGAAC ACAGTGGACA TCTNCAGAAC GAGAGAAAGT GGCACAGCTG CGAAGTCTCT 1080 

25 

ACAAGAAGGC TGGGCTTGGC AAGGAGAAAC GCCATGTCAC CTACGTTGTA GCCAAAAAAG 1140 

GTGTGGGCCG CAAAGTGCGC CGGCCAGCTG GAGTCAGAGG TCATTTCAAG GTGGTGGACT 1200 

30 CAAGGATGAA GAAGGACCAA AGAGCACAGC AACGTAAGGA AGAAAAGAAA AAACACAAAC 1260 

GGAAGTAAGC AGAGCTGCCA GGCTCCCAGG AGAGCATGGG GACTAGGAGG AAGGGTGTGG 1320 

CATGGCTCAG TCTGGCCCCC TTGATTACCG GCCTAGCCCC TGCTCACATC ACAGCTGTCT 1380

GAAGAACAGT GAGGTGGAGT GCCTAGAACT CCCGTGGTGG TCCTGAGCAG AGAGGAGGAT 1440

GTCCTCCTGC CTGCCTGAAG GTCTCCCATG AAAACACTGC TGAAGTGTGT TGACACTCAT 1500

GACCCTTTTT TTAAACCGTT AAAGGGAAGT TCGGTGTTGG AGCGATACTC AATGTAGTCA 1560

GTCTACACCT GGACGTGTGG GCCACTTAAG CCCTCCCCAC CCCCATCCTA TTCCTRAATA 1620

AAACCAGGAT AATGGAARAA AAAAAAAAAA AAAAAAAAAG GGGGGGCCCN TAAAGGGNCC 1680

45 

CANNTTT 1687 



50 

(2) INFORMATION FOR SEQ ID NO: 160: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1842 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160:

GGATGACAGA TTGCGACANA GATTTGTGAC CCTTCCTGCT GAACTTCAGA 3* 3-3 AGC TG AA 60
ANCAGCGTAT GATCAAAGAC AAAGGCACGG CGAGAACAGC ACTCACCAGC ACTCAGCCAG 120
CGCATCTGTG CC 2CGAGAAT CCTTTACTTC ATCTAAAGGC A3CAGTGAAA 3AAAAGAAAA 180
GAAACAAGAA GAAAAAAACC ATTCCTTCAC CAAAAAGGAT TGAGAGTCCT TTGAATAACA 240
AGCTGCTTAA CAGTCCTGCA AAAACTCTGC CAGGGGCCTG TGGCAGTCCC CAGAAGTTAA 300
TTGATGGGTT TCTAAAACAT GAAGGACCTC CTGCAGAGAA ACCCCTGGAA GAACTCTCTG 360
CTTCTACTTG AGGTGTGCCA G3CCTTTCTA GTTTGCAGTC TGACCCAGCT '3GCTGTGTGA 420
GACCTCCAGC ACCCAATCTA GCTGGAGCTG TTGAATTCAA TGATGTGAAG ACCTTGCTCA 480
GAGAATGGAT AACTACAATT T 3AGATCCAA TGGAAGAAGA CATTCTCCAA GTTGTGAAAT 540
ACTGTACTGA TCTAATAGAA GAAAAAGATT TGGAAAAACT GGATCTAGTT ATAAAATACA 600
TGAAAAGGCT GATGCAGCAA TCGGTGGAAT CGGTTTGGAA TATGGCATTT <3ACTTTATTC 660
TTGACAATGT CCAGGTGGTT TTACAACAAA CTTATGGAAG CACATTAAAA GTTACATAAA 720
TATTACCAGA GAGCCTGATG CTCTCTGATA GCTGTGCCAT AAGTGCTTGT GAGGTATTTG 780
CAAAGTGCAT GATAGTAATG CTCGGAGTTT TTATAATTTT AAATTTCTTT TAAAGCAAGT 840
GTTTTGTACA TTTCTTTTCA AAAAGTGCCA AATTTGTCAG TATTGCATGT AAATAATTGT 900
GTTAATTATT TTACTGTAGC ATAGATTCTA TTTACAAAAT GTTTGTTTAT AAAGTTTTAT 960
GGATTTTTAC AGTGAAGTGT TTACAGTTGT TTAATAAAGA ACTGTATGTA TATTTGGTAC 1020
RGGCTCCTTT TKGTGAAYCC TTAAAAACTC AACTCTAGGA RGCAACTACT GTTTATTATA 1080
CTAAARGGCT GAAAAMCCTC CAGGCCAGAC TGCTAAGCTC TGAAATYCCT GAGAGGTCTC 1140
AGACCGGGAT TCTACTTGTT CCAAGAAAGG GTAAAGCTTC TAAACCATCT TATTCTTGTC 1200
TCCAAGCATG AACACAGGAG CATGTYAAGA AAATCTTTAC TACTTTCTYC CATGCGGAGA 1260
AATCTACATA TTTTGAATTA GAAACACCCT CACACCCACT TGAAGATTTT TTTCCTGGGA 1320
ACATTATGTC CCGTAGATCA GAGGTGGTGT TGTCTTTTTG CTTCTACTGG CCATTGAGAA 1380
ACTTTGATGA TAAAAAAGAA CGGTATAGAT TTTTCAAACG TATATAAAAT ATTTTTATGT 1440
TATATGTTAT GCCATAACTT TAAAATAAAA ATAGTTTAAA ATTCTATGCT AGTGGATATT 1500
TGGAACTTTT TCCTCAAACA AACACCCCAC ACTGACTTCA GCAAAACCCT AAAACTAGCT 1560
ACAGATTACT ACTACGAATG AATCATYAAG TTTTGTGTCT GCAACAATTT AGAAGCACTA 1620
AGCCCAAATA TCAGGAAATG TGTGTATGAT GGAATTTTCT AGGACAAAAC AGATCAAGAT 1680
TAAAACAGGA TCAAGGATTA ATGGTATAAA AATGGTCTAC TAAAACAGGA TCAAGGATTA 1740
AAACAGGATC AAGGATTAAT GGTATAAAAA TCTCTACTGG TTACCGGGTG GCNGGGCCAT 1800
\CA3GG7AGT 3GTG3ATGGA TAGTTTAGTT TGGMAAGGGT AA 1842



5 

(2) INFORMATION FOR SEQ ID NO: 161:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 770 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161:

15 

GGCACGAGCC CTATGCTGTT CTTGTGATAA TGAGTGAGTC TCACAAGATC TGGTGGTGTT 60 

ATAGGCATCT GGCATTTCCC CTGCTGACGC TCATTCTCTA TCCTGCCACC CTGGGAAGAA 120 

GTGTCTTCTG TCATGATTGT AAGTTTCCTG AGGCCTCCCC AGCTATGTAG AACTGTGAGC 180

CAATTAAACC TCTTTTCTCT ATAAATTATC CAGTCTTATA TATTTCTTCA TAGCAGTGTG 240

AGAACAGATA ATACCGTAAA TTGGTATCAC AGAGAGTGGG GTGTTGCTAT AAACACATCT 300 

25 

GAAAATGTTA AAGCAAATTT GGAACTGGGT AACAGGCAAA GGCTGGAACA GTTKGAAGAA 360 

CAGTTAAGAA GAAGACAGGA AAATATGAGA AATCTTGAAA CTTCCTAGAG TCTTAAAGGT 420 

30 CTCAGAAGAC ATGAAGATGT GGGAAGCTTT GGAACTTCCT AGAGACTTGT TTGAATGGCT 480 

TTGACCAAAA TGCTGATAGT GATATGGACA ATGAAGTCCA GGCTGAGCTT ATCCAGACAG 540 

ACATAAGAAG CTCGCTGGGA ACTTGAGTAA AGATCACTCT TGCTAGGCAA AGAGACTGGT 600 

35 

GGCCTTTTTT CCTCTGCCCT AGAGATCTGT GGAAATCTGA ACCTGAGAGA GATGATTTAG 660 

GGTATCTGGC AGAAGAAATA TCTAAGCGGC AAAACCTTCM AGAGGAAGCA GAGCATAAAC 720

GTTTGAAAAA TTTGCAGCCT GACNATGGGA GACCAAAGTT AAACCCAATT 770



(2) INFORMATION FOR SEQ ID NO: 162:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 519 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162:

GAATTCGGCA CGAGCTGAGA GGCACAGGAG CAACAGCCAG TGCCCCCTGC AGAGGACCAC 60

TGGGGTCACA GACTTCARAC CTGATGACCT GGGCTCAGAT CCCAGCTCTG CACCTACCAG 120

CCGTGTGACA AGGTGTCCTC TCTGAGCCTC AGTCACACAC TGCCTTAACG GTTGGGCCTC 180 

ATGGAGCTGT TTGTGAAGGT TAAATGGGAA GACATAAAGC ACTTAGCCCA GAGCCAAGGA 240

CATGCTGAAT AGGATAATGG TGGCCTCCTT TGGCGCTGTJ CTGGTGCAGG TGTGCCGAGG 300

AAYTGGGCAG GGGTGACAGA TACCTCTTCT AACCTAGTT : CTTTCCAAGA ACCTAATTGG 360

TGTCTCTCCC TCCCCCAGGC AATTGGAAGG AGGAGGCTGG GCCCCAGCCC CAGAATACGG 420

GAGGTTTCTC ACCGTGGTAG GGAAATTGCT GGGTTGGG3G TGTGGGCAAC CACAGTGATC 480

10 

GTCTCTCTGC AGGACGGATG AGGCTTTGCT GACAGAGGC 519 



15 

(2) INFORMATION FOR SEQ ID NO: 163: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 753 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163:

GGCACGAGCG GCACGAGCAG CCAGTTGCTG ACTGGCACAT GGCCTCCAGC GTCCCGGCTG 60
GTGGGCACAC TAGAGCCGGA GGGATCTTCT TAATTGGTAA ATTGGATCTT GAAGCTTCAC 120
TGTTTAAATC TTTTCAGTGG CTTCCCTTTG TACTTAGAAA AAAATGCAAC TTCTTCTGCT 180
GGGACTCATC CGCTCACAGC CTTCCCCTCC ACCCTCTCTC TGCCTCATGC TCTGCCCCTG 240
CCTGCCATGC CTCCGATACT CACCTTTTGT ACCCCAGCAC CCGTGCCCTC TGCOCCTCGA 300
TCTTTGCCTG GCTGGTTGCT CCTCACTCAG TGTTCAGGAC AAATGCTCCT GGCCCTACCC 360
CATCTAGCCA GTCTAGCCCG GTCTTCCCTG TCTTCCCTGT TTCATTCATG GCTCTTATTG 420
TTTGTTWACT TGTGTGCTGT TGACTTTTAA CTCTCTCAGT CCCCACTGGA ATGCAAGCGA 480
TCTCCCAAGC TCCTAGAATT GTTCCTGCCT CTTCACAGGC CCTTACGCTG TGTGTGCTCG 540
TGCCGAATTC GGCACGAGGG TATGTGCACT TGCTGGTATG TATGTAGGTG TTTGCTAACA 600
CATACGTGCA CACGCAGAAT GCTTCCAGGG GACTGCACAG CCTCTAGTTC GCAGCCCCCA 660
CCCCTCCCTT TGSCCCTGCA CTCTCCCCTC TCTGAGCTGC ATTCGCATGA AAGGGTGCAN 720
GGTTCCTGAN CCCGCNAGCG NCACCTCCTG GGA 753



(2) INFORMATION FOR SEQ ID NO: 164:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1400 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164:
5 GGCACAGTTT ATTAATACCT ATTA7GGGAA AGTCACTTTG GTTGGCATTG AAAATTACAT 60 



CATCTTTAAA GCAGTATTTG TCCCCAGA7G GACTCATCAC TAGCAAAGAC TAGGTTCATT 120

GGAAGGCATA GGGTGAGAGA ATCGGAAGAT GRAGTGGAGG CGGGTTGTTA AAGTGCTGTC 180

10 

AGTGAGTGAT TTTGTCTACT TGAATAATGG TCCATGTTTG GGGGCATATT GTGTTTCATA 240 

AGAAGTGAAA GGTATTTGCA AAGTAAGCTA CAAATGACCC ATAAATCTGT TAACAACAGT 300 

CCTTAATATG CAAAGATGAA AAACAAGCAT TACTGCTACC CAAAGGGAAC TGGTGCTTGG 360

TGATGTGCAG ATGGGGCTGT TGGTTAAGAG AGCTATTACA '3GTTTTCTCT CTTAGGTTTC 420

ATAGGAGGTA GTTACTGAGA TGAGATTGTT TTATCTTTTT GAATACAGAT CTCTTGTCTT 480 

20 

GAGTTAGTTC TGAGGATGGG AGTAATAAAG GAGTTTTTTG TTTTTTTGTT TGTTTGTTTG 540 

TTTTGGCTCC TTAGTAATAC TGCTCTGACA TTTATTTCTA TTATTCTTCA AAGAAAGGAA 600

25 ACCAACTGAA ATGTTTGCTT TAACAAACAT TTTAATAAGT TCTCTGGGTT TTTTTTTCCC 660 

CTTTTAAAAA AATTAGCATA T AC CAT AGO A ATAAAAGAAC TAATGTTAAC TATTGTATGC 720 

TACAACTTAA GTGATTTTTC TAAAGAAGCA CAATGTCATT GRAAGTATTA TTGAAAAGGA 780

TCATAGTCAC ATTGAATTTG TGAAGGCCAA AGAAATTGAA GGGAGTGATA TTTTCATTTT 840

ATGATATTCA CATATTTAGT AAATTTTGTG TACAAGAATA CCAGGCAGAG TGTTTTACCC 900

ATGGAAACAG GTTTCAGATT ACTTTGTTTT TACTGTTAGA GTCTCAAGTT TAGAAATGCT 960

AACACTTAAA TCAGTTTTTT TCTCACTATA CTTGAAGATT GTTAATATTT TGATATCTTC 1020



CTAGCTTGAT GGAATTTAAA CATATCTTCA GATCTGTGAC AGTGACAGCC AATAGGACTG 1080 

40 

ATAATATTAG CTTCAAACCA ATAATATCCA GGGTTAAAAT AAAAATCATA GTGAAAGTAC 1140 

GATTGTAAAA TTATGCTATA TTAACTTTTA AGTCTGTAAT AACTTGACAT CAAAATGTTA 1200 

TGTAATTACC ATAAATAATG GCTAGCGAGA ACATCTTTGG AAATTCTCAA ATTACCTTTC 1260

TTACTACACT GTTTGCAGAA TGAATGTAGA AATGATCCTG TTAGCTTTCT GAATGTTCTG 1320



TGGTTGAATG TGTTTTTGCT TAAATAAAGC TTTTGGTATT TGTTTAAATW ACAAAAAAAA 1380 

50 

AAAAAAAAAA AAAAACTCGA 1400 



55 

(2) INFORMATION FOR SEQ ID NO: 165:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2153 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165:

5 

CAGGCCTCAG GGCCTCTGGT GGCTCTGGCC CAGACAGTAT TTGCAGTTCT TGTGCTATGG 60

GTGG3AGTCT TCTTCCTCAA GTTTCGGCAG CTGTGCTGTG NCTGGATGG3 CTGCTCCTCC 120

CAGGGCTCAA GGGCTGTGGT CCGCTCAGGG TCTCATTTCC CCAGGCCAA3 TTCAAGGCAG 180

CAGCCCTTTG TGAGGCGCTC TTGGCCCTGG GCTGGAGGGA GAACTTTAA3 CTTTTTTGCT 240

CACAGGGACG TGGTATGGGC CCTGGGTGCA GGTGCCCACA TTCTGCTAAT GAGAGCTTTG 300

15 

TCTGATCAGT CCTGGGTCCA TCAGTTTGTC CATGTGTCCG GCTGCCAGCC CGTICCTTGG 360 

GATCCTTCCC CTGGGGTGTA GCCTTGTTCA TTAGTATA7A CTCATTCCTT CATGCTTTCC 420 

TCAGCAGAAC ACTTCCACTT CTGAGGTGAG CTTTTGCCCC RTGCCCTTCC TCCACAGGTG 480

TTGCCTTTTT ATAAAGACCT GATAGCAGAA TAAATTGGTG TTTCCCTGTT GACCCAGCAC 540

CATTTCTGTG GGCCTAGAAT ATGGCCCTCA ACCCTTAGAG TGGGGCAGT'3 AGGGCTTGAG 600 

25 

GAGTGACCCT TCCTTTCTCA TGGTTTTAGT CATTTTGGCT GCCAGCCCTT AATGGCACAG 660 

ATCTGCTGCT TCTAACAGAT GGCCAGGAGG TGACACCGAT TTCAGCCATT GCCAAGGTTA 720 

30 GCACCCTCTC CTTTGAGCCT AGGGCCACAC TGTTCATTGT CACTTTAGGC AAGTGCCTGT 780 

TTGGCTTTAA AGGTAAGCCT GCCAGCTGTG AGAAGCCTTG GTAACTGATG GACTCATTTC 840 

CTGGTCCTTA AAGATGCAGC CTCTTAAGGG CTCCTTGATG GATGCCATCT CTCCTAGCCC 900 

35 

CCAGCCCTGG TGCCACTGGT GGGCAGGTTC CCATTCTTTG GGGCTGGGAG GGACAGCTTG 960 

CCTGTTTCTG GTCACAAATT ACAGTCTTCT CTCCTGTACC ATTCTGTGGC TTCAGCATGG 1020 

40 GGGCAGTAGC CTTTCATTAG TGTAGATAGT CATTCCCTGG TAGGGTGGAG GGTAAGACAT 1080 

AGGGTCTGGA ACTGTTTGGG ACCTTTTGGG GATGTCCTGT GCCTCCCAGA TTCCTMGATT 1140 

CTGGGAGGAG AGGCTGCCGC ATTCTGCTGC TCCTCACAGC GAGCAAAGCT GCACCCACTT 1200 

45 

ACATTCAGTA TTTTCCTGGC ACTACAAAGA GTGGGAAGGC CTGGGATTTG CTGCTGCTCC 1260 

CTTAGAGCAG GGCCCCTYTT TTCAGCACTT TGGACACCTG GAGACCCAGC CCTGTTATTT 1320 

50 AATGGTAGTG GGCAAGTGTG TGTGCATACT GTCTGCCACT GCTTTCTCCC TGCCCCATGC 1380 

CAGAGAGCCC TGTCCCTGCC AGGCCCAGCC TTCTTAGCCC CAACTTGGGA ACAAAGTGCA 1440 

ACATGGGATC ATGGGTTGGG GTGCTCAGGT GAGCCCTCTC TATAGTGCTT CCCTGGGCCA 1500 

55 

AGCTGACACC AGCCCCTGAG GGTGGGGTGG GACGGGTGGT GCTTAAAAGA GGAAGGGGAC 1560 

CAGTGTAGCA ACTTGCCAGG GACCCCACCC CTCCCTCTCT GGGCCTGTGC AGTGAGCATG 1620 

GGGATTCCCA TCAAGGGGCC TGGCACCTGT GCTAGTTACG TAGCCGCTGN TCACGCGCTC 1680

ACTCCTGACC ACATGCACGT TCCCTAGATG CAGACTGCTT TCAACTTTAA AGCTGTACAA 1740

TTTGGTTATG TTT3TGCTGA CTTAAAATAT ATTTTAATGA GGAAAAAATA ATGGAGAACC 1800

CTGGGAAGGA CCTGGTTCTT TTGCTTCTCG GGGAACTGTA AGCCCTCGCG TTCTGGGAAT 1860

CGCTCTCTGC TGCTCTTTCC TGGAAGCTAA GCCTGTCTCC ACCGCCCGAG GCCTGCGCCG 1920

GTGCTCCCGC CGCAGTTGCG TTTGCTTTGG ACCTTGCGTG CGGGGGAGGG GGTGCTCGGT 1980

CCGAGCCCGC TCCTTTCTGT ACACCTAGCG CTGCCCGCCC CGCTTGTGTC TGAGGTCGTG 2040 

TATGTCAAAA ATAAAGGCGC TAGAAACGGA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 2100 

15 

AAACTCGAGG GGGGGCCCGT ACCCAATTAA CCCNNTATGA TCTATAAAGC GTC 2153 



20 

(2) INFORMATION FOR SEQ ID NO: 166: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1251 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166:

30 

GCCCACGCGT CCGCCCACGC GTCCGGCGGT GCGGAGTATG GGGCGCTGAT GGCCATGGAG 60 

GGCTACTGGC GCTTCCTGGC GCTGCTGGGG TCGGCACTGC TCGTCGGCTT CCTGTCGGTG 120

35 ATCTTCGCCC TCGTCTGGGT CCTCCACTAC CGAGAGGGGC TTGGCTGGGA TGGGAGCGCA 180 

CTAGAGTTTA ACTGGCACCC AGTGCTCATG GTCACCGGCT TCGTCTTCAT CCAGGGCATC 240 

GCCATCATCG TCTACAGACT GCCGTGGACC TGGAAATGCA GCAAGCTCCT GATGAAATCC 300

40 

ATCCATGCAG GGTTAAATGC AGTTGCTGCC ATTCTTGCAA TTATCTCTGT GGTGGCCGTG 360 

TTTGAGAACC ACAATGTTAA CAATATAGCC AATATGTACA GTCTGCACAG CTGGGTTGGA 420 

45 CTGATAGCTG TCATATGCTA TTTGTTACAG CTTCTTTCAG GTTTTTCAGT CTTTCTGCTT 480 

CCATGGGCTC CGCTTTCTCT CCGAGCATTT CTCATGOCCA TACATGTTTA TTCTGGAATT 540 

GTCATCTTTG GAACAGTGAT TGCAACAGCA CTTATGGGAT TGACAGAGAA ACTGATTTTT 600 

50 

TCCCTGAGAG ATCCTGCATA CAGTACATTC CCGCCAGAAG GTGTTTTCGT AAATACGCTT 660 

GGCCTTCTGA TCCTGGTGTT CGGGGCCCTC ATTTTTTGGA TAGTCACCAG ACCGCAATGG 720 

55 AAACGTCCTA AGGAGCCAAA TTCTACCATT CTTCATCCAA ATGGAGGCAC TGAACAGGGA 780 

GCAAGAGGTT CCATGCCAGC CTACTCTGGC AACAACATGG ACAAATCAGA TTCAGAGTTA 840 

AACAGTGAAG TAGCAGCAAG GAAAAGAAAC TTAGCTCTGG ATGAGGCTGG GCAGAGATCT 900 

ACCATGTAAA ATGTTGTAGA GATAGAGCCA TATAACGTCA CGTTTCAAAA CTAGCTCTAC 960



AGTTTTGCTT CTCCTATTAG CCATA7GATA ATTGGGCTAT GTAGTATCAA TATTTACTTT 1020 



5 AATCACAAAG GATGGTTTCT TGAAATAATT TGTATTGATT GAGGCGTATG AACTGACCTG 1080 



AATTGGAAAG GATGTGATTA ATATAAATAA TAGCAGATAT AAATTGTGGT TATGTTACCT 1140 



TTATCTTGTT GAGGACCACA ACATTAGCAC GGTGCCTTGT GCAKAATAGA TACTCAATAT 1200

10 

GTGAATATGT GTCTACTAGT AGTTAATTGG ATAAACTGGC AGCATCCCTG A 1251 



15 

(2) INFORMATION FOR SEQ ID NO: 167:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 882 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167:

25 

GACSMTCTAG AACTATGGTC CCCCGGGACT GCAGGAATTC GGCACAGCGG CTGCGGGCGC 60
GAGGTGAGGG GCGCGAGGTT CCCAGCAGGA TGCCCCGGCT CTGCAGGAAG CTGAAGTGAG 120 



30 AGGCCCGGAG AGGGCCCAGC CCGCCCGGGG CAGGATGACC AAGGCCCGGC TGTTCCGGCT 180 



GTGGCTGGTG CTGGGGTCGG TGTTCATGAT CCTGCTGATC ATCGTGTACT GGGACAGCGC 240



AGGCGCCGCG CACTTCTACT TGCACACGTC CTTCTCTAGG CCGCACACGG GGCCGCCGCT 300 

35 

GCCCACGCCC GGGCCGGACA GGGACAGGGA GCTCACGGCC GAYTCCGATG TCGACGAKTT 360 



TCTGGACAAK TTTCTCAGTG CTGGCGTGAA GCAGAGTGAC YTTCCCAGAA AGGAGACGGA 420 
40 GCAGCCGCCT GCGCCGGGGA GCATGGAGGA GAGCGTGAGA RGCTACGACT GGTCCCCGCG 480 



CGAMGCCCGG CGCACCCAGA CCAGGGCCGG CAGCARGCGG ANCGGAGGAR CGTGCTGCGG 540 



GGCTTCTGCG CCAAYTCCAG CCTGGCCTTC CCCACCAAGG AGCGCGCATT CRACGACATC 600 

45 

CCCAACTCGG AGCTGAGCCA CCTGATCGTG GACGACCGGC ACGGGGCCAT CTACTGCTAC 660 



GTGCCCAAGG TGGCCTGCAC CAACTGGAAG CGCGTRATGA TCGTGCTGAG CGGAAGCTGT 720 



50 GCACCGCGTG CGCCTACCGC GACCCGYTGC GNTCCCGCGC GAGCACGTGC ACAACGCCAG 780 



CGCGCACTGA CTTCAACAAT TCTGGCGCCG CTACGGGAAG TCTCCCCCAC CTCATGAAGT 840 



CAAGCTCAAG AATACACCAA TTCTTTCTGC GCGACCCTTC TG 882

55 



(2) INFORMATION FOR SEQ ID NO: 168: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1208 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168:

GGGAAACTCA AAAGGATGAT GGAATGGTTG ATGGAGCCAG AGCCTAGAAG TRAAGGGATA 60 

CAGAGTGAAG ATAGAGGTAT TTACGTATAT TTWAATATTA GCTTTGGAAT TACGTAGGGA 120 

TTCTTAAGAA AAGATCATGA CAGGACAGCC ACATTTOGTA AAATGTCAGG GCAGCCAGTG 180

15 CATGGTCCTC CTGGGGCTCC TCAGTTGACG GGTTTAAATC ATTTCCTGAT CCCCCTGCCC 240 

TC^TTTGAGG AATGCATAC\ GTACGTGAAA TGCCTGTGGT ATGAGTTGCA ATGGGCAATC 300 

AACCTGGGTA AATCCAAGAT TAATGATTAG TTCTAAAGAT CCAGTTGAAG TTCTAGAGTG 360 

20 

GGAATTTTCC GTCAAGCARC TCAGCACAGC TTTATGCCTG TTCCTCTAAT A^CGATAGGT 420 

AACAAATAGC TGTGTKTWCA CAGCTAGGAR GATAACCAAA TCTAGAGTTC TTGARTCTCA 480

25 TTTAATAAAT AAKTATTATG AGTACCAACT GCATATTTCA GGCACTGCAT TrGACTCTGT 540 

TAAATACTGA TYCCTTAKGA CMSCCACWTC AGAWAACMTT AATCTGTCTG ATCAATAAAC 600 

AGCTTGAC1T AGAGRGGTAA AATAGCTTGC CACAGGTWAC CCAATTAGTA GGTAACAGCG 660 

30 

ACAGAATAAC AGTGCAGTTA AAATCTTAGA CTGGAGACTA ATTGCATAAG TTTGAATTTC 720

AGTTCTGCTA TGTAAATTTG GGTGAGTACC TTAATTYACC TGAGTCTCGG TCTTTATATC 780 

35 TGTAGAATGG AGCTAATGAT ATTACTTAAT TTGCTTTATG TGAGATTAAA TGTACTAATA 840 

TATGTAAATC ACTTACAACA GCATTTGACA TATTTGACAT ACTTAATATA TTTGCTACTA 900 

ATACTATTAG CAACAGCATT CTGATTTTCC AAGTTGAAAT TCAGTGTTTT CTTTTTTACT 960 

40 

TTGCCATAAT TTACAATGTT GTGCTCTGTA AACCATAAAT TTCCCTGAGG TGTTGTCAGG 1020 

TTAAAAAAAA ATCACTATGG CCCCCARNMA CTTGGAAAAT AGAAATGAGA CCAGCTTCAT 1080 

45 CTATATTCTT TACTGCAAAT AACTTAGAAT TGTAATAGGC TAATATGTAC TCGGACTTCC 1140 

AATTTGGGAA TATGACAAAA ATAATACTAT TTAGCTAAAA CATATACAGA ACTTATTTTT 1200 

CCTCTGAA 1208 






(2) INFORMATION FOR SEQ ID NO: 169: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1307 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169:

GGCACGAGAG AAAAGAGGTT GAGAATGTTT TCTAGCAGGC AGAATGTGCA TACATGTTTT 60 

5 

CATGARTGTC CTTTGGGTGC TGTTTCTTTT AAATCCTCTG TGCACAGGGC TC1X3GCCTTT 120

ARTAAACTGT TTTTCTGTCT TACGTCATGC TGACTGGGTG CTAGGGGCTG ATTACAAAGG 180 

10 GGAAGAGTTG AACAGACATC AGGGGCCGAT GAAACCAAAG GACTAGGAGT CAGGAGAACA 240 

AGTCAGGGAT TAGGAGACAG CGGTTTOGTT TATTGTTATC CAGCTGGAGG ACTCCTAGGG 300 

GCAGCAGCAG GAGGAATACC AGGGCCACGG AGGGGCAGGA GTCTCACAGT GGAGGGCAGA 360 

15 

CTCTAACAGA TGCCAGCTGA ACGCTCGCTG GCCCTGGATG TCATACGAGT TGGGGACCAG 420

AAATCTGGGC TCAGAGAACC CGTCCAGGGA GATTTGAAGC CATGGGTTAT CTTCTAGAGT 480

TGATACTGAT AATATATTTT AATTTTTATT GATGTTTAAT ACCTTCTGAA ACAGGAGGGT 540

AAGATCAGAT GGGAAGCCCY TCTGTTGAAG GATCTTGGGA ACCTTGGTGG TTTTTTTTTT 600 

TTGGTTTTTT TTTTTTTGAT CGAGCTGTGG ACATCCTTCT TAATTCGATT NTGAGGATTT 660 

25 

GTTTAACTAA AAAGTTCCCA AACACAGAAA GGGCCTCCCC ACCTGCTTTG GGGAGCTGTC 720

TGTSCTGGGA GTGCCAGGCA TCCSATGGGA CCCATCACTG CCAGTGTCTG TGCCTCCCAG 780

30 AGGTCAGCCC TGTGTCTGCC CTGGCTCTGT CTCCTCTGTG ACAGGGCAGA GCATTTCTGG 840 

TCAGTTTCTC CATGGTGCCT CCCACCCCTT TGTAAAGTGG ATGGACATGA TGGAATTCAG 900 

TTGTCTCACC CTGATAGCCT GGGTGTTGAT ATTCACTTTA CCCGCACTCA GACACAGGCG 960 

35 

ACCTTGAAGC AGTTCTCGGT GTGTAGAGTC CACGTGACAG TCCCCACAGC CTCCCCAGAT 1020 

AGCTGTGTGC CTGTGCGCTA CTGCTGTGCC ATTTTCCCAA CTTNGGCGTT TCACTAAATG 1080 

40 CAGCTGATCT CTCTCTCTGT GCACTCGTGA TCCATGTTGA ACAATACATG TAGGTTCTTT 1140 

TTCCACGCAA TGTAAGAACA TGATATACTG TACGTTGGAA AGCATTTACC TTATTTATAT 1200 

ACCTGAATGT TCCTACTACA CAAATAAACA TATATTAAAT WCTAAAAAAA AAAAAAAAAA 1260 

45 

CTGGAGGGGG GGCCCGGTAC CCAAATCGCC GGATAGTGAT CGTAAAC 1307 

50 

(2) INFORMATION FOR SEQ ID NO: 170: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1624 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170:




GGCACGAGGT C3CCGCCGCG GCCGCCTGGA ATTGTGGGAG TTGTGTCTGC CACTCGGCTG 60 

CCGGAGGCGA AGGTCCCTGA CTATGGCTCC CCAGAGCCTG CCTTCATCTA GGATGGCTCC 120 

TCTGGGCATG CTGCTTGGGC TGCTGATGGC CGCCTGCTTC ACCTTCTGCC TCAGTCATCA 180 

GAACCTGAAG GAGTTTGCCC TGACCAACCC AGAGAAGAGC AGCACCAAAG AAACRGAGAG 240 

AAAAGAAACC AAAGCCGAGG AGGAGCTGGA TGCCGAAGTC CTGGAGGTGT TCCACCCGAC 300 

GCATGAGTGG CAGGCCCTTC AGCCAGGGCA GGCTGTCCCT GCAGGATCCC ACGTACGGCT 360

GAATCTTCAG ACTGGGGAAA GAGAGGCAAA ACTCCAATAT GAGGACAAGT TCCGAAATAA 420 

TTTGAAAGGC AAAAGGCTGG ATATCAACAC CAACACCTAC ACATCTCAGG ATCTCAAGAG 480 

TGCACTGGCA AAATTCAAGG AGGGGGCAGA GATGGAGAGT TCAAAGGAAG ACAAGGCAAG 540 

GCAGGCTGAG GTAAAGCGGC TCTTCCGCCC CATTGAGGAA CTGAAGAAAG ACTTTGATGA 600 

GCTGAATGTT GTCATTGAGA CTGACATGCA GATCATGGTA CGGCTGATCA ACAAGTTCAA 660 

TAGTTCCAGC TCCAGTTTGG AAGAGAAGAT TGCTGCGCTC TTTGATCTTG AATATTATGT 720 

CCATCAGATG GACAATGCGC AGGACCTGCT TTCCTTTGGT GGTCTTCAAG TGGTGATCAA 780 

TGGGCTGAAC AGCACAGAGC CCCTCGTGAA GGAGTATGCT GCGTTTGTGC TGGGCGCTGC 840 

CTTTTCCAGC AACCCCAAGG TCCAGGTGGA GGCCATCGAA GGGGGAGCCC TGCAGAAGCT 900 

GCTGGTCATC CTGGCCACGG AGCAGCCGCT CACTGCAAAG AAGAAGGTCC TGTTTGCACT 960 

GTGCTCCCTG CTGCGCCACT TCCCCTATGC CCAGCGGCAG TTCCTGAAGC TCGGGGGGCT 1020 

GCAGGTCCTG AGGACCCTGG TGCAGGAGAA GGGCACGGAG GTGCTCGCCG TGCGCGTGGT 1080 

CACACTGCTC TACGACCTGG TCACGGAGAA GATGTTCGCC GAGGAGGAGG CTGAGCTGAC 1140 

CCAGGAGATG TCCCCAGAGA AGCTGCAGCA GTATCGCCAG GTACACCTCC TGCCAGGCCT 1200 

GTGGGAACAG GGCTGGTGCG AGATCACGGC CCACCTCCTG GCGCTGCCCG AGCATGATGC 1260 

CCGTGAGAAG GTGCTGCAGA CACTGGGCGT CCTCCTGACC ACCTGCCGGG ACCGCTACCG 1320 

TCAGGACCCC CAGCTCGGCA GGACACTGGC CAGCCTGCAG GCTGAGTACC AGGTGCTGGC 1380 

CAGCCTGGAG CTGCAGGATG GTGAGGACGA GGGCTACTTC CAGGAGCTGC TGGGCTCTGT 1440 

CAACAGCTTG CTGAAGGAGC TGAGATGAGG CCCCACACCA GGACTGGACT GGGATGCCGC 1500 

TAGTGAGGCT GAGGGGTGCC AGCGTGGGTG GGCTTCTCAG GCAGGAGGAC ATCTTGGCAG 1560 

TGCTGGCTTG GCCATTAAAT GGAAACCTGA AGGCCAAAAA AAAAAAAAAA AAAAAAAAAA 1620 

AAAA 1624 






(2) INFORMATION FOR SEQ ID NO: 171:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2003 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171:

GGCACGAGCC AGCTTGCAGG AGGAATCGGT GAGGTCCTGT CCTGAGGCTG CTGTCCGGGG 60

CCGGTGGCTG CCCTCAAGGT CCCTTCCCTA GCTGCTGCGG TTGCCATTGC TTCTTGCCTG 120



TTCTGGCATC AGGCACCTGG ATTGAGTTGC ACAGCTTTGC TTTATCCGGG CTTGTGTGCA 180 

15 

GGGCCCGGCT GGGCTCCCCA TCTGCACATC CTGAGGACAG AAAAAGCTGG GTCTTGCTGT 240 



GCCCTCCCAG GCTTAGTGTT CCCTCCCTCA AAGACTGACA GCCATCGTTC TGCACGGGGC 300 



20 TTTCTGCATG TGACGCCAGC TAAGCATAGT AAGAAGTCCA GCCTAGGAAG GGAAGGATTT 360 



TGGAGGTAGG TGGCTTTGGT GACACACTCA CTTCTTTCTC AGCCTCCAGG ACACTATGGC 420



CTGTTTTAAG AGACATCTTA TTTTTCTAAA GGTGAATTCT CAGATGATAG GTGAACCTGA 480 

25 

GTTGCAGATA TACCAACTTC TGCTTGTATT TCTTAAATGA CAAAGATTAC CTAGCTAAGA 540



AACTTCCTAG GGAACTAGGG AACCTATGTG TTCCCTCAGT GTGGTTTCCT GAAGCCAGTG 600 



30 ATATGGGGGT TAGGATAGGA AGAACTTTCT CGGTAATGAT AAGGAGAATC TCTTGTTTCC 660 



TCCCACCTGT GTTGTAAAGA TAAACTGACG ATATACAGGC ACATTATGTA AACATACACA 720 



CGCAATGAAA CCGAAGCTTG GCGGCCTGGG CGTGGTCTTG CAAAATGCTT CCAAAGCCAC 780 

35 

CTTAGCCTGT TCTATTCAGC GGCAACCCCA AAGCACCTGT TAAGACTCCT GACCCCCAAG 840 



TGGCATGCAG CCCCCATGCC CACCGGGACC TGGTCAGCAC AGATCTTGAT GACTTCCCTT 900 



40 TCTAGGGCAG ACTGGGAGGG TATCCAGGAA TCGGCCCCTG CCCCACGGGC GTTTTCATGC 960 



TGTACAGTGA CCTAAAGTTG GTAAGATGTC ATAATGGACC AGTCCATGTG ATTTCAGTAT 1020 



ATACAACTCC ACCAGACCCC TCCAACCCAT ATAACACCCC ACCCCTGTTC GCTTCCTGTA 1080 

45 

TGGTGATATC ATATGTAACA TTTACTCCTG TTTCTGCTGA TTGTTTTTTT AATGTTTTGG 1140 



TTTGTTTTTG ACATCAGCTG TAATCATTCC TGTGCTGTGT TTTTTATTAC CCTTGGTAGG 1200 



50 TATTAGACTT GCACTTTTTT AAAAAAAGGT TTCTGCATCG TGGAAGCATT TGACCCAGAG 1260 



TGGAACGCGT GGCCTATGCA GGTGGATTCC TTCAGGTCTT TCCTTTGGTT CTTTGAGCAT 1320 



CTTTGCTTTC ATTCGTCTCC OrXTTTGGT TCTCCAGTTC AAATTATTGC AAAGTAAAGG 1380 

55 

ATCTTTGAGT AGGTTCGGTC TGAAAGGTGT GGCCTTTATA TTTGATCCAC ACACGTTGGT 1440 



CTTTTAACCG TGCTGAGCAG AAAACAAAAC AGGTTAAGAA GAGCCGGGTG GCAGCTGACA 1500 
GAGGAAGCCG CTCAAATACC TTCACAATAA ATAGTGGCAA TATATATATA GTTTAAGAAG 1560

GCTCTCCATT TGGCATCGTT TAATTTATAT GTTATGTTCT AAGCACAGCT CTCTTCTCCT 1620

ATTTTCATCC TGCAAGCAAC TCAAAATATT TAAAATAAAG TTTACATTGT AGTTATTTTC 1680

AAATGTTTGC TTGATAAGTA TTAAGAAATA TTGGACTTGC TGCCGTAATT TAAAGGTCTG 1740

TTGATTTTGT TTCGGTTTGG ATTTTTGGGG GAGGGGAGCA CTGTGTTTAT GCTGGAATAT 1800

GAAGTGTGAG ACCTTCCGGT GCTGGGAACA CACAAGAGTT GTTGAAAGTT GACAAGCAGA 1860

CTGCGCATGT CTCTGATGCT TTGTATCATT CTTGAGCAAT CGCTCGGTCC GTGGACAATA 1920

AACAGTATTA TCAAAGAGAA AAAAAAAAAA AAAAAACTCG NGGGGGGGCC CGGTACCCAA 1980

TTCGCCCTAT AGTGAGCCNA TTC 2003

(2) INFORMATION FOR SEQ ID NO: 172: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 786 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172: 
GGCACAGCGG CACGAGAAGA CTTTGGTGTT TAAGAGATTA ATGTGTTAGC CAGAACAACT 60
CATTTCTCTA CCMGTGTGTA GTCCATTTAT CTTTAAAGAT TTTCTATTGG AATAATTTTG 120
AAATTACTTT CITAGTTTTC TTCATTAAAA ACTAAGAAAA TGCTTTGTTT ATTATGAATT 180
GCTATTTCTC TTGATTATTA TTCTTGGAGA AAGTCTATCA GACGTAATTC TTCTGATTTG 240
CTTCTAGGCT AGAGGAAAAT GTGAAAGATG ACAAATGAAA ATTTCAAAGG TTGTCAGTAG 300
TATGACTTCT TTTATCGTTT GTCATTATCA CAAATATATC AACATAGGAC TTTTAAAAGA 360
TATTTTGTAC ATATTGGGCC TTAGTAGGAT TTTGCATGAA TTTTTTTTTT CTTTTATGCC 420
CAGAGAGAAA GAGCAAAGAA ATAACCAAGG GTGATGTACT CGTATTGAAG GTTTACCAAA 480
TAAGGACTGC TTTTATTATG AACTATAGTC TATATTCTAA GTAAATCAAT TTTTCTATTA 540
TCT 3TTTTTT GTTCCTGCAG GCAAGATCTC TGAACTTTAT GCAGAGGGTT CTTTTAAAAA 600
AACAAAGTTG AATTTTTTTA TTTCTTGGAA TATTTTTTTT CATTGATTTC TCCCAAGTAG 660
AGCAGATTCA AATCTCCTTT GTACCCTATG TCTTTTTTGT TTTGCTATTA GCTCAGTATT 720
CCGTTTCTAC ATTTTCCTTT CCTAGAACCA GTCAATAAAT GACAAAAAAA AAAAAAAAAA 780
ACTCGA 786



(2) INFORMATION FOR SEQ ID NO: 173: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1758 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173: 

GGGACGAGCC CTGCCCACCT CCTGCAGCCT CCTGCGCCCC GCCGAGCTGG CGGATGGAGC 60 

TGCGCACGGG GAGCGTGGGC AGCCAGGCGG TGGCGCGGAG GATGGATGGG GACAGCCGAG 120 

ATGGCGGCGG CGGCAAGGAC GCCACCGGGT CGGAGGACTA CGAGAACCTG CCGACTAGCG 180 

CCTCCGTGTC CACCCACATG ACAGCAGGAG CGATGGCCGG GATCCTGGAG CACTCGGTCA 240 

TGTACCCGGT GGACTCGGTG AAGACACGAA TGCAGAGTTT GAGTCCAGAT CCCAAAGCCC 300 

AGTACACAAG TATCTACGGA GCCCTCAAGA AAATCATGCG GACCGAAGCT TCTGGAGGCC 360 

CTTGCGAGGC GTCAACGTCA TGATCATGGG TGCAGGGCCR GCCCATGCCA TGTATTTTGC 420 

CTGCTATGAA AACATGAAAA GGACTTTAAA TGACGTTTTC CACCACCAAG GAAACAGCCA 480 

CCTAGCCAAC GGTATTTTGA AAGCGTTTGT CTGGAGTTAG AAAGTTCTCT TCTTCAACAC 540 

GTCCCTCCCC AGGGTGTTCC TCCCTGTGAC CCAGCCGCCT CGACTTCGGC CCC<:TTGCTC 600 

ACGAATAAAG AACTCAGAGT TGTGTGTGCA ATGCACACCC AGACACACGC ACGCACACAC 660 

ACGCGCGCGC ACACACATGC TTTTTTCTGT TCCCCTCCGC TTTCTGAAGC CTGGGGAGAA 720 

ATCAGTGACA GAGGTGTTTT GGTTTTATTG TTATGTGGGT TTTCTTTTGT ATTTTTTTTG 780 

TTTGTTTTGT TTTTAAACAT TCAAAAGCAA TTAATGATCA GACATAGGAG AAACCCTGAA 840 

TAGAAACAAA ACTTTTGAAT GCTGGATTCA AAAAAAAAAA AAAGTTATCT GGACAGCTTC 900 

TTTGAGACTA TTTAAAAACT GGTACAACAG GTCTCTACAA CGCCAAGATC TAACTAAGCT 960 

TTAAAAGGTC AAGAAGTTTT ATGGCTGACA AAGGACTCGC GCAACGCAGA AGGCCTTTCC 1020 

CACCTTAAGC TTCCGGGGAT CTGGGAATTT TACCCCCATT CTCTTCTGTT TGTCTGAGTC 1080 

TCATCTCTCT GCAAGCAAGG GCTGAAATCA TTTTGTTTGG TTGTTTTGAG GGAGAGAGGC 1140 

GGGGTGGGGG GGTGCAAATC TGCCAGCAGC TCTTACGTAA GGCATGTTTT ATTGGGGAGG 1200 

GCTGAGCTTT TATTTTCTCC TCTCCAGTGG GGTTGGCTTT TATTGTTTCT TGTTTGGGTT 1260 

TGGAATGGAA ATATGGATAG CAGCATAAAG TACTTTTATT TTGACAAAAT TCATTTTTTT 1320 

CAACAATGGA GACATAGATT TGACCCACAA TAACTTCTCC CCCTCTCTTT TTACTCTGCT 1380 

CAAAAAGCAT CTCTCCTCCC ATTACCCAAC CTTGGTCATA AGTGTGCCTG GCTGGTTTGC 1440 

AGATATTTGT TCTGCTTTGT AAAAATTGGC CATTAGTGCA TTTATTGAGA TGATCTCTAA 1500 

AGAGCTATGC CCTGACCTAC CCCTGATTCT ATGACATTGG GGCCCTTCTT TTGCTGAAAC 1560 

TGCCTTACGT AATGGTTTTA 7TCCTTGAAA GAGATTTGAC GGAATCCATT TTATGCCAAG 1620 

TGCTGCCCTG CACTGTTTCT GCAATATGT3 GTGTATGCTG TGGTGATCTT GCTGGGAATG 1680 

ATTATAAGTG TGTGTGTGGT 3GGGGAGT3G GTATTACATG CATTGCTGAA GAGTCAAAAA 1740 

AAAAAAAAAA AAACTCGA 1758 



(2) INFORMATION FOR SEQ ID NO: 174: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 888 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174: 

CTGTTAGAAT GCCCAGTTTA CCTGGATGGC AACCCAACAG TGCTCCTGCC CACCTGCCCC 60 

TCAATCCTCC TAGAATTCAG CCCCCAATTG CCCAGTTACC AATAAAAACT TGTACACCAG 120 

CCCCAGGGAC AGTCTCAAAT GCAAATCCAC AGAGTGASMC ACCACCTCGG GTAGAATTTG 180 

ATGACAACAA TCCCTTTAGT GAAAGTTTTC AAGAACGGGA ACGTAAGGAA CGTTTACGAG 240 

AACAGCAAGA GAGACAACGG ATCCAACTCA TGCAGGAGGT AGATAGACAA AGAGCTTTGC 300 

AGCAGAGGAT GGAAATGGAG CAGCATGGTA TGGTGGGCTC TGAGATAAGT AGTAGTAGGA 360 

CATCTGTGTC CCAGATTCCC TTCTACAGTT CCGACTTACC TTGTGATTTT ATGCAACCTC 420 

TAGGACCCCT TCAGCAGTCT CCACAACACC AACAGCAAAT GGGGCAGGTT TTACAGCAGC 480 

AGAATATACA ACAAGGATCA ATTAATTCAC CCTCCACCCA AACTTTCATG CAGACTAATG 540 

AGCGAGGCAG GTAGGCCCTC CTTCATTTGT TCCTGATTCA CCATCAATCC CTGTTGGAAG 600 

CCCAAATTTT TCTTCTGTGA AGCAGGGACA TGGAAATCTT TCTGGGACCA GCTTCCAGCA 660 

GTCCCCAGTG AGGCCTTCTT TTACACCTGC TTTACCAGCA GCACCTCCAG TAGCTAATAG 720 

CAGTCTCCCA TGTGGCCAAG ATTCTACTAT AACCCATGGA CACAGTTATC CGGGATCAAC 780 

CCAATCGCTC ATTCAGTTGT ATTCTGATAT AATCCCAGAG GAAAAAGGGN AAAAAAAARA 840 

AMAARAAARA ARAAAGGAGA TGATGATGCA GAATTCCACC AAGGCTCC 888 



(2) INFORMATION FOR SEQ ID NO: 175: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 2379 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175: 

GGCAGAGCTA GTGTGGACTC CATCCCCCTG GAGTGGGATC ACGNCTATGA COTCAGTCGG 60 

GACCTGGAGT CTGCAATGTC CAGAGCTCTG CCCTCTGAGG ATGAAGAAGG TCAGGATGAC 120 

AAAGATTTCT ACCTCCGGGG AGCTGTTGSC TTATCAGGGG ACCACAGTGC CZTAGAGTCA 180 

CAGATCCGAC AACTGGGCAA AGCCTGGATG ATAGCCGCTT TCAGATACAG CAAACCGAAA 240 

ATATCATTCG CAGCAAAACT CCCACGGGGC CGGAGCTAGA CACCAGCTAC AAAGGCTACA 300 

TGAAACTGCT GGGCGAATGC AGTAGCAGTA TAGACTCCGT GAAGAGACTG GAGCACAAAC 360 

TGAAGGAGGA AGAGGAGAGC CTTCCTGGCT TTGTTAACCT G7ATAGTACC GAAACCCAAA 420 

CGGCTGGTGT GATTGACCGA TGGGAGCTTC TCCAGGCCCA GGCATTGAGC AAGGAGTTGA 480 

GGATGAAGCA GAACCTCCAG AAGTGGCAGC AGTTTAACTC AGACTTGAAC AGCATCTGGG 540 

CCTGGCTGGG GGACACGGAG GAGGAGTTGG AACAGCTCCA GCGTCTGGAA CTCAGCACTG 600 

ACATCCAGAC CATCGAGCTC CAGATCAAAA AGCTCAAGGA GCTCCAGAAA GCTGTGGACC 660 

ACCGCAAAGC CATCATCCTC TCCATCAATC TCTGCAGCCC TGAGTTCACC CAGGCTGACA 720 

GCAAGGAGAG CCGGGACCTG CAGGATCGCT TGTSGCAGAT GAATGGGCGC TGGGACCGAG 780 

TGTGCTCTCT GCTGGAGGAG TGGCGGGGCC TGCTGCAGGA TGCCCTGATG CAGTGCCAGG 840 

GTTTCCATGA AATGAGCCAT GGTTTGCTTC TTATGCTGGA GAACATTGAC AGAAGGAAAA 900 

ATGAAATTGT CCCTATTGAT TCTAACCTTG ATGCAGAGAT ACTTCAGGAC CATCACAAAC 960 

AGCTTATGCA AATAAAGCAT GAGCTGTTGG AATCCCAACT CAGAGTAGCC TCTTTGCAAG 1020 

ACATGTCTTG CCAACTACTG GTGAATGCTG AAGGAACAGA CTGTTTAGAA GCCAAAGAAA 1080 

AAGTCCATGT TATTGGAAAT CGGCTCAAAC TTCTCTTGAA GGAGGTCAGT CGTCATATCA 1140 

AGGAACTGGA GAAGTTATTA GACGTGTCAA GTAGTCAGCA GGATTTGTCT TCCTGGTCTT 1200 

CTGCTGATGA ACTGGACACC TCAGGGTCTG TGAGTCCCAY ATCAGGAAGG AGCACCCCAA 1260 

ACAGACAGAA AACGCCACGA GGCAAGTGTA GTCTCTCACA GCCTGGACCC TCTGTCAGCA 1320 

GTCCACATAG CAGGTCCACA AAAGGTGGCT CCGATTCCTC CCTTTCTGAG CCARGGCCAG 1380 

GTCGGTCCGG CCGCGGCTTC CTGTTCAGAG TCCTCCGAGC AGCTCTTCCC CTTCAGCTTC 1440 

TCCTGCTCCT CCTCATCGGG CTTGCCTGCC TTGTACCAAT GTCAGAGGAA GACTACAGCT 1500 

GTGCCCTCTC CAACAACTTT GCCCGGTCAT TCCACCCCAT GCTCAGATAC ACGAATGGCC 1560 

CTCCTCCACT CTGAACTAAG CAGATGCCAT CTGCAGAAGT GCTGGTAGCA TAAGGAGGAT 1620 

C3-3GTCATAA GCAATCCCAA A7TACCAACA AGAGGACCTT GATCTTGGC3 AAAC-CCl-TrCG 1680 

ijT^jTGGCAGr TTTAGCCTCC TCCAGATCAC ATGTGTGCAA ATTATGGCTT CAGA3GTG3A 1740 

AGATAAACAG TGACGGGGGA ACAAACAGAC AACAAGAAGG TTTGGAA3AA AT2TGGTTTG 1800 

AGACTCTGAA CCTTAGCACT AAGGAGATTG AGTAAGGACC TCCAAAGTTC CC2G3ACTCA 1860 

TGAATTCTGG GCCCTTGGCC KATTCTGTGC ACAGCCAAGG ACTTCAGTAG ACZATTTGGG 1920 

CAGCTTTCCC ATGGTGCT3C T2CAACCATC AGATAAATGA CCCTCCCAAG CA-CATGTCA 1980 

GTGTCGTACA ATCTACCAAC CAACCAGTGC TGAAGAGATT TTAGAACCTT GTAA2ATACA 2040 

ATTTTTAAGA GCTTATATGG CAGCTTCCTT TTTACCTTGT TTTCCTTTGG GG2ATGATGT 2100 

TTTAACCTTT GCTTTAGAAG CACAAGCTGT AAATCTAAAA GGCACTTTTT TTTAGAGGTA 2160 

TAAAGAAAAA CTAGATGTAA TAAATAAGAT CATGGAAGGC TTTATGTGAA AAAA3TTGAA 2220 

TGTTATAGTA AAAAAAAAAG ATATTTATGT ATGTACAGTT TGCTAAAGCC AAGTTTTGTT 2280 

TGTATTGATT TCTTTGCATT TATTATAGAT ATTATAAAAT AAAAAAAAAA AAAAAAAAAG 2340 

TCGAGGGGGG GCCCGGTACC CAATTCGCCC TATAGTGAG 2379 



(2) INFORMATION FOR SEQ ID NO: 176: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1348 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176: 

<3CGCCTTCAC GATGCCGGCG GTCAGTGGTC CAGGTCCCTT ATTCTGCCTT CTCCTCCTGC 60 

TCCTGGACCC CCACAGCCCT GAGACGGGGT GTCCTCCTCT ACGCAGGTTT GAGTACAAGC 120 

TCAGCTTCAA AGGCCCAAGG CTGGCATTGC CTOGGGCTGG AATACCCTTC TGGAGCCATC 180 

ATGGAGGTGA GGGGCAGGGG TGGGGACCGC TATGCCCAGG GTCCCTCAAA GTGCTGGAGG 240 

GGCTGTRACT TGGTGGGGAG TGGGTCTGTC ACAGCCATCC TCTGTCCAGG GTGGGGCAAG 300 

GCCTGGGACA GTGCCAGGCA CCCCAGGACC CCTTCCAGGC TTGTCTCCTG CTCCACCGCC 360 

TCAACACCCC CCACCCCTGC CCAAGCTGTT TCTCCTCTGC CTCTCTNNTT CCCTGCCCCA 420 

GGACTTCTCT CTTCTCCTCT GCCTCTCCTT GGACCCCTGC CCTTCCTCTA CCTCTGACCT 480 

GTGAACACAC AGACACATGC TCACACACTA AGTCCCARGO ACACMSAAAG GCAATGTGGA 540 

CCAGCACAAA CCTCCACTCT CCCGGCTCCA TCCCARCGGG CCTGTGGCTG GCCATGAAAA 600 

rTGGGGGCTA CCTGGA3GGA AGCATCCTCA TCCCAGGTGA GTGGGCACCA GCCCTTCCCT 660 

3TATGTGTGT TGTGGGTGGA AGCAGGCATG AGA3CATCTT AGCCCATA3G TTTGTATTCA 720 

3GGACTTCCA AACCCAGACC TA3AAAGA3T GTGTCTTCTA CCAGATHTTG TTCAAAAAA3 780 

r-GTTTGTGAT 3ATGGAACTA CACGATAGAG GGAGTGAGCA AGAACAAT3A GGATTA3ACT 840 

GGAGCGTGAA ATAGTCTAGG AGCATGGCTT CCAAAACATA TGCTGT3A3G TCTGTCCAC3 900 

rGAGAGTTGG GCCATGGATT TAATTCTGAG CCTCTTAGCA GGCAAA3CAA AGACAGAAAG 960 

CAGATCGGCT GTGGATTTCT GTCTATAAAA TGTGAGTTCT TGGCCGGGTG CGGTGGCTCA 1020 

CGCCTGTAAT CCCGGCGCTT TGGGAGGCCA GGGCGGATGG GTCGCGAG3T CAGGAGGTTG 1080 

GAAACCATCC TGGCCGGAAT GGTGAAGCCC TGACTCTACT AGAAGTGCAA AGATTGGCT3 1140 

GGTGTGGTGG CGTGCGCCTG TGGTCCCA3C TTCTCGGGAG GCTGAGGC3G GAGAGTTGCT 1200 

TGGGCCTGGG AGGCCGAGGT TGCGGTGAGC TGAGATCCTG CCATTGCACT TCAG0CTGO3 1260 

CACAGAGCCA GACTCTGGCT CAAAAAAAAA AAAAAAAAAA ACTCGAGGGG GGCCCGTACC 1320 

CAATTCGCCG NATATGATCG TAAACAAT 1348 



(2) INFORMATION FOR SEQ ID NO: 177: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1502 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177: 

CTCAAAATAA ATAAATAAAT AAAAATTTGT ATTCCATTGA TTTGGGTAGA CACCAGGAAT 60 

GTGCATTTCT AACAAGCTTT CCAGGCGATC CTATAGTAAG TCATCTGTGG ACTACTTTAA 120 

GAAACTCTTC TATAGAGAAT GGAGTTGGAT TAATAATAGG TGATTTTTTA CACTGGACTG 180 

ATTCACAAGA ACCTAAACAG TAGTCCATGA AGCTGCTCAT CTGTGGTAAC TATTTGGCCC 240 

CGTCTCACTC TGAAAGCAGC AGGAGATGTT GTTTACTTTG TTTCTATCCC CTTTGTCTGG 300 

AGATTAATTT TGGAATGAAA GTTTTTCTCT CTATGCCATT CCTGGTTCTT TTCCAAAGCC 360 

TCATACAAGA GGATTAGGTC ACAATGCATG CATTACCTTT TAAAAGAATG CGATATTGAT 420 

ACCGATGCTT ACTTTTTTTT TTTTTNACTA CTTGTTTTAT TCCTTCCAGN AAAGTATAGC 480 

CCGCCTTTCT ATAGCATAGT TCTCTTTAGG TGGAATGATT CCTATAAGAT TTCTCATTAT 540 

TAAATCATGC ATTTTTCAAG ATGGAATCAA T^rrriGATTT AATCTAAGCT GATATTCTCA 600 

TTTGTTAGAA GAACAACCTA CATGCTAGAG AGAGAGGAGG AAATATACCC ACGACCACAC 660 



AGCCAGTTAG TATCCAGTTG GTGCTGGACT CCAGCCAGGT GTCCTSCCTC ATGGTAGTTA 720 

AATGATATAT AGAAAAGGTA AATTTTTAAA GAAATATTTA TTAATATATT CCTATAAAAC 780 

ATTTTAAAGG TAACCACATA AAAATGGTTA ATTTTTCCAT TCCAAAGTAA ATGCTAAGCA 840 

TGTTTATTAA TGAAGCAGTA CTTCTGATTA GTATATGACA TTCTGAAGTT AATTAAACTC 900 

ATTGCACTAA ATGTGTCTTC CTTGGTATAG TGGAGGATTT GAGGATTGGA ATATAGAGTA 960 

GAGTGCTTGC TTAAGCCTGG 3AGCCCATCT TTATAGCTAT TTGATGTAAG AAAAGAGACA 1020 

TGGNCCATTT CTAAACTATA TAAGGTGAGT GTGTCTATTC CCAGCAGATA TAAAGGAAAA 1080 

AGGAAACTTT TTTGATTCCC ACCTTCCCAG CCTCACCTAG CCATCTTCCA GCCTCAAATA 1140 

TAGAGATGTT AGTGCAAGGT CCTGGGCTCT AGGTGATCAT TTCATAAGTC CTTTACAGAT 1200 

AAAGAAAAAG TAGTGTTTGT ATGTTTGTTT TTAAGTAACC CCAAAACAAA TTTATATTGT 1260 

ATTCAGCAAA ATTGGAATTC AGGTGTTTAA TTTTAGAACA TGAAGTGCCT GCTGTTTTAA 1320 

GCATTGACTT GTATAAAAAG AATTGCATGT CTCCAGTAAG CTTATGGGTT TTCTCATTTT 1380 

TAGGTATATG GCTTTTAATC ATGTAAAGTG AAACATTAGT TTTCTTGCAT TTTATTACAG 1440 

GTTCTTTGTT GCAATAAAGA TGCTGCTGAA ATTAATTGAA AAAAAAAAAA AAAAAAACTC 1500 

GA 1502 



(2) INFORMATION FOR SEQ ID NO: 178: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 1637 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178: 

ATTTTCTAGC CCACAAGGAC TGAAGTTCAG ATCCAAAAGT TCACTTGCTA ATTATCTTCA 60 

CAAAAATGGA GAGACTTCTC TTAAGCCAGA AGATTTTGAT TTTACTGTAC TTTCTAAAAG 120 

GGGTATCAAG TCAAGATATA AAGACTGCAG CATGGCAGCC CTGACATCCC ATCTACAAAA 180 

CCAAAGTAAC AATTCAAACT GGAACCTCAG GACCCGAAGC AAGTGCAAAA AGGATGTGTT 240 

TATGCCGCCA AGTAGTAGTT CAGAGTTGCA GGAGAGCAGA GGACTCTCTA ACTTTACTTC 300 

CACTCATTTG CTTTTGAAAG AAGATGAGGG TGTTGATGAT GTTAACTTCA GAAAGGTTAG 360 

AAAGCCCAAA GGAAAGGTGA CTATTTTGAA AGGAATCCCA ATTAAGAAAA CTAAAAAAGG 420 

ATGTAGGAAG AGCTGTTCAG GTTTTGTTCM AAGTGATAGC AAAAGAGAAT CTGTGTGTAA 480 

TAAAGCAGAT GCTGAAAGT3 AACCTGTTGC ACAAAAAAGT CAGCTTGATA GAACTGTCTG 540 

CATTTCTGAT GCTGGAGCAT GTGGTGAGAC CCTGAGTGTG ACCAGTGAAG AAAACAGCCT 600 

TGTAAAAAAA AAAGAAAGA7 CATTGAGTTC AGGATCAAAT TTTTGTTCTG AACAAAAAAC 660 

TTCTGGCATC ATAAACAAAT TTTGTTCAGO CAAAGACTCA GAACACAACG AGAAGTATGA 720 

3GATACCTTT TTAGAATCTG AAGAAATCGG AACAAAAGTA GAAGTTGTGG AAAGGAAAGA 780 

ACATTTGCAT ACTGACATTT TAAAACGTGG CTCTGAAATG GACAACAACT GCTCACCAAC 84 0 

CAGGAAAGAC TTCACTGAAG ATAGCATCCC ACGGAACACA GATAGAAAGA AGGAAAACAA 900 

GCCTGTATTT TTGCAGCAAA TATAACAAAG AAGCTCTTAG CCCCCCACGA CGTAAAGCCT 960 

TTAAGAAATG GACACCTCGT CGGTCACCTT TTAATCTCGT TGAAGAAACA CTTTTTCATG 1020 

ATCCATGGAA GCTTCTCATC GCTACTATAT TTCTCAATCG GACCTCAGGC AAAATGGCAA 1080 

TACCTGTGCT TTGGAAGTTT CTGGAGAAGT ATCCTTCAGC TGAGGTAGCA AGAACCGCAG 1140 

ACTGGAGAGA TGTGTCAGAA CTTCTTAAAC CTCTTGGTCT CTACGATCTT CGGGCAAAAA 1200 

CCATTGTCAA GTTCTCAGAT GAATACCTGA CAAAGCAGTG GAAGTATGCA ATTGAGCTTC 1260 

ATGGGATTGG TGCACCCTGA AGACCACAAA TTAAATAAAT ATCATGACTG GCTTTGGGAA 1320 

AATCATGAAA AATTAAGTCT ATCTTAAACT CTGCAGCTTT CAAGCTCATC TGTTATGCAT 1380 

AGCTTTGCAC TTCAAAAAAG CTTAATTAAG TACAACCAAC CACCTTTCCA GCCATAGAGA 1440 

TTTTAATTAG CCCAACTAGA AGCCTAGTGT GTGTGCTTTC TTAATGTGTG TGCCAATGGT 1500 

GGATCTTTGC TACTGAATGT GTTTGAACAT GTTTTGAGAT TTTTTTAAAA TAAATTATTA 1560 

TTTGACAACA ATCCAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1620 

AAAAAAAAAA AAAAAAA 1637 



(2) INFORMATION FOR SEQ ID NO: 179: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2911 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179: 

GGTGGTTTTT GTTCTGCAAT AGGCIGGCTTA GAGGGAGGGG CTTTTTCGCC TATACCTACT 60 

GTAGCTTCTC CACGTATGGA CCCTAAAGGC TACTGCTGCT ACTACGGGGC TAGACAGTTA 120 

CTGTCTCAGC TCTAGGATGT GCGTTCTTCC ACTAGAAGCT CTTCTGAGGG AGGTAATTAA 180 

AAAACAGTGG AATGGAAAAA CAGTGCTGTA GTCATCCTGT AATATGCTCC TTGTCAACAA 240 

TGTATACATT CCTGCTAGGT GCCATATTCA TTGCTTTAAG CTCAAGTCGC ATCTTACTAG 300 

TGAAGTATTC TGCCAATGAA GAAAACAAGT ATGATTATCT TCCAACTACT GTGAATGTGT 360 

GCTCAGAACT GGTGAAGCTA GTnTCTGTG TGCTTGTGTC ATTCTGTGTT ATAAAGAAAG 420 

ATCATCAAAG TAGAAATTTG AAATATGCTT CCTGGAAGGA ATTCTCTGAT TTCATGAAGT 480 

GGTCCATTCC TGCCTTTCTT TATTTCCTGG ATAACTTGAT TGTGTTCTAT GTCCTGTCCT 540 

ATCTTCAACC AGCCATGGCT GTTATCTTCT CAAATTTTAG CATTATAACA ACAGCTCTTC 600 

TATTCAGGAT AGTGCTGAAG ANGCGTCTAA ACTGGATCCA GTGGGCTTCC CTCCTGACTT 660 

TATTTTTGTC TATTGTGGCC TTGACTGCCG GGACTAAAAC TTTACAGCAC AACTTGGCAG 720 

GACGTGGATT TCATCACGAT GCCTTTTTCA GCCCTTCCAA TTCCTGCCTT CTTTTCAGAA 780 

ATGAGTGTCC CAGAAAAGAC AATTGTACAG CAAAGGAATG GACTTTTCCT GAAGCTAAAT 840 

GGAACACCAC AGCCAGAGTT TTCAGTCACA TCCGTCTTGG CATGGGCCAT GTTCTTATTA 900 

TAGTCCAGTG TTTTATTTCT TCAATGGCTA ATATCTATAA TGAAAAGATA CTGAAGGAAG 960 

GGAACCAGCT CACTGAARGC ATCTTCATAC AGAACAGCAA ACTCTATTTC TTTGGCATTC 1020 

TGTTTAATGG GCTGACTCTG GGCCTTCAGA GGAGTAACCG TGATCAGATT AAGAACTGTG 1080 

GATTTTTTTA TGGCCACAGT GCATTTTCAG TAGCCCTTAT TTTTGTAACT GCATTCCAGG 1140 

GCCTTTCAGT GGCTTTCATT CTGAAGTTCC TGGATAACAT GTTCCATGTC TTGATGGCCC 1200 

AGGTTACCAC TGTCATTATC ACAACAGTGT CTGTCCTGGT CTTTGACTTC AGGCCCTCCC 1260 

TGGAATTTTT CTTGGAAGCC CCATCAGTCC TTCTCTCTAT ATTTATTTAT AATGCCAGCA 1320 

AGCCTCAAGT TCCGGAATAC GCACCTAGGC AAGAAAGGAT CCGAGATCTA AGTGGCAATC 1380 

TTTGGGAGCG TTCCAGTGGG GATGGAGAAG AACTAGAAAG ACTTACCAAA CCCAAGAGTG 1440 

ATGAGTCAGA TGAAGATACT TTCTAACTGG TACCCACATA GTTTGCAGCT CTCTTGAACC 1500 

TTATTTTCAC ATTTTCAGTG TTTGTAATAT TTATCTTTTC ACTTTGATAA ACCAGAAATG 1560 

TTTCTAAATC CTAATATTCT TTGCATATAT CTAGCTACTC CCTAAATGGT TCCATCCAAG 1620 

GCTTAGAGTA CCCAAAGGCT AAGAAATTCT AAAGAACTGA TACAGGAGTA ACAATATGAA 1680 

GAATTCATTA ATATCTCAGT ACTTGATAAA TCAGAAAGTT ATATGTGCAG ATTATTTTCC 1740 

TTGGCCTTCA AGCTTCCAAA AAACTTGTAA TAATCATGTT AGCTATAGCT 1X3TATATACA 1800 

CATAGAGATC AATTTGCCAA ATATTCACAA TCATGTAGTT CTAGTTTACA TGCCAAAGTC 1860 

TTCCCTTTTT AACATTATAA AAGCTAGGTT GTCTCTTGAA TTTTGAGGCC CTAGAGATAG 1920 

TCATTTTGCA AGTAAAGAGC AACGGGACCC TTTCTAAAAA CGTTGGTTGA AGGACCTAAA 1980 

TACCTGGCCA TACCATAGAT TTGGGATGAT GTAGTCTGTG CTAAATATTT TGCTGAAGAA 2040 

GCAGTTTCTC AGACACAACA TCTCAGAATT TTAATTTTTA GAAATTCAPG GGAAAT7GGA 2100 

TTTTTGTAAT AATCTTTTGA TGTTTTAAAC ATTGGTTCCC TAGTCACCAT AGTTACCACT 2160 

TGTATTTTAA GTCATTTAAA CAAGCCACGG TGGGGCTTTT TTCTCCTCAG TTTGAGGA3A 2220 

AAAATCTTGA TGTCATTACT CCTGAATTAT TACATTTTOG AGAATAAGAG GGCATTTTAT 2280 

TTTATTAGTT ACTAATTCAA GCTGTGACTA TTGTATATCT TTCCAAGAGT TGAAATGCTG 2340 

GCTTCAGAAT CATACCAGAT TGTCAGTGAA GCTGATGCCT AGGAA2TTTT AAAGGGATGC 2400 

TTTCAAAAGG ATCACTTAGC AAACACATGT TGACTTTTAA CTGATUTATG AATATTAATA 2460 

CTCTAAAAAT AGAAAGAGCA GTAATATATA AGTCACTTTA CAGTO2TACT TCACAGTTAA 2520 

AAGTGCATGG TATTTTTCAT GGTATTTTGC ATGCAGCCAG TTAACTCTCG TAGATAGAGA 2580 

AGTCAGGTGA TAGATGATAT TAAAAATTAG CAAACAAAAG TGACTTGCTC AGGGTCATGC 2640 

AGCTGGGTGA TGATAGAAGA GTGGGCTTTA ACTGGCAGGC CTGTATGTTT ACAGACTACC 2700 

ATACTGTAAA TATGAGCTTT ATGGTGTCAT TCTCAGAAAC TTATACATTT CTGCTCTCCT 2760 

TTCTCCTAAG TTTCATGCAG ATGAATATAA GGTAATATAC TATTATATAA TTGATTTGTG 2820 

ATATCCACAA TAATATGACT GGCAAGAATT GGTGGAAATT TGTAATTAAA ATAATTATTA 2880 

AAGCTAAAAA AAAAAAAAAA AAAAACTCGA G 2911 

(2) INFORMATION FOR SEQ ID NO: 180: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 519 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180: 
GGCACGAGCC CCAGGCCAGC CAGGGCCAGG CCTACTTTGG CCACCCTTAA ATTAGAATGT 60 
GGGGTCAGGG GTCACAGAAA AGCCATTTCT CTGACCTAGT GTTTGGCGTC CGGGAACTCT 120 
GTGCCCAACC TTCAGACCCT GGCAGTCCTC ACTGAGGCCA TTGGCCCAGA GCCCGCCATC 180 
CCCCGARACC CCCGGGAGCC GCCTGTTGCC ACGTCCACAC CTGCCACACC CTCTGCCGGG 240 
CCCCAGCCCC TCCCAACCGG GACCGTGCTG GTCCCTGGGG GTCCTGCCCC ACCTTGCCTT 300 
GGGGAGGCAT GGGCCCTCCT CCTCCCACCC TGCCGGCCGT CACTCACCTC TTGCTCCTGG 360 
TCCCCCAGGC CTAGCCCTTG GAAGGAGACA GGAGTCTAGG GAGG2TGAAG CCCACTCCCG 420 
GGGAGGCCCG TGCTCCTCCA GCCCCAGGGA CAGCAAGGAA AAGAGAAGAG AGCAGAGCAT 480 
TTCATGGCTC TAATAAAAAA AAAAAAAAAA AAAACTCSA 519 



(2) INFORMATION FOR SEQ ID NO: 181: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 963 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181: 
TCCCCTTGGG GCCGGAAAAA GCGGGGTTGG CCTGNCCATT GGTTNTCCAT GCCGCCCGCC 60 
CATGCCCCAG TACTAGCCTG CAGTCCCAAT GTAGCCCCTC CCTCYTCCMA GAGCCCYTCM 120 
AACCGCCCCG STCANTTGTG ATTTCAGGAG GATTTGATGA AGATGTTAAA GCGAAAGTGG 180 
AGAACCTTCT CGGGATTTCC AGCCTGGAAA AAACGGACCC TGTTAGGCAA GCACCCTGCA 240 
GCCCTCCCTG TCCCCTTCTT CCCCTCCCCT TCYCCCGCCC GTGGAGACAG CTGTTYTCAG 300 
CAGGGCTCTC CGCAGGGAGG GGGCCGGCTC CTTCCCTGGC AGCAACATCC TTGCCCTTGT 360 
CACACAAGTC AGCCTCCATC TGCGCAGCTC TGTGGATGCG CTGCTGGAGG GCAACAGGTA 420 
TGTCACTGGC TGGTTCAGCC CCTACCACCG CCAGCGGAAG CTCATCCACC CGGTCATGGT 480 
TCAGCACATC CAGCCCGCAG CGCTCAGCCT CCTGGCACAG TGGAGCACCC TCGTGCAGGA 540 
GCTGGAGGCT GCCCTGCAGC TGGCTTTCTA CCCGGATGCC GTGGAGGAGT GGCTGGAGGA 600 
AAACGTGCAC CCCAGCCTGC AGCGGCTGCA ARCTCTGCTG CAGGACCTCA GCGAGGTGTC 660 
TGCCCCCCCG CTGCCACCCA CCAGCCCTGG CAGGGACGTT GCTCAGGACC CCTGAGGGGA 720 
GAGCTCATGC CAGGGGGCTC CTGCTGGAGG CTGGGGGGGC TCTGCWYTKY CWWWTGGCCT 780 
GGGCAATACG GCCCACGTGG GCGTCGTGCC CTCTGGCCCA GCAGTGTCTT GCCCACACTC 840 
AGTTCCTGAG GGCCCTGGGC AGCCCCTGGG GGAGAGACTA GAAAACACAG AAGGAAGCAG 900 
CACAGGGAGA CCCGCTTTGT GATCTGCATG TGTGACACTG ATTCTTTGGA AATAAAGAGT 960 
GGAAGCTG 

(2) INFORMATION FOR SEQ ID NO: 182: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1128 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182: 
TGTAAAAGTT ATCAGTAATC CTAATTCTTT TCCTGGGTTT TCCTTTTGTC ACTTATTAAT 60 
CAGTTTTTGA AAGGACGAAT GAATTTAGAG ATGTACTCTG GAGCAGTATC ATGTTAAACC 120 
AGGGGTATAT TAGAAAAATC ATCCTCATAA TCATTCTGGG AAGTTTTTCC TCCCCAAAAA 180 
AAGCCATCCT GATGGGTTTT CAAAACCAGA AAAAAGCTCT TAATGAGGAA CAGACCACTG 240 
GAGTACCCAT GAGCATCTCA GGAAAACTGA GACCCTCGAG AAGCCTTGAT TTCGTGCAAC 300 
CCCCAAGGTT TCAGAGCCAG CAGCCCAGTG CTGTGGTTGA CAGACGTGGT TTTXTGGRGA 360 
AAGCAGCCAG AGGCCAGGAA TTTTCAGAGT CGTGAGTCAC GRTYTCCCAC CCAAGATTAG 420 
AGCAMAGATT AGCCATACTG AGATTTGGTA AAATCATTCT GTCTAAGCAA TGGAGGTGTG 480 
TGCAMACGTG CAGTGCCTGT TCACAGGGGA TGCAGGCAGA TCSYGGGTTT AGGATGGGGR 540 
AGGCCACCGC ACCCCCYTTC AYTGCTCTGC ACCTGCTCCC TCACGTGGAC ACTGTCCACA 600 
ACTGTGGCTC TCACAGGACA GTTGCCCAAG GAGCTCATAT CTTATTGGAG ATAGGGGGTC 660 
GTACAGGTGA CATTCATGAG CAGTGTGAGC CGGGTGACAT GGGGGTGTCA ACCCAGCATC 720 
TGTCCAGGAG CTCCTCCTGC AGCGGCTCTG GCAGGTGGCC TGAGGCTCCT TTTTGAGAGA 780 
GAACTGTTTG GCCTTCCTGT CTCCTCTCCT CTGATCTGTT CTTTCTTGGA ACACCACCCA 840 
AGAACGTCAC CTCCTCCATC AGATTGTGAG CTCCTGGAGG GCAGGAGCTG TGTCCTTCTA 900 
TTCATCTTCC TATCCCCAGA ACCTTGCACA GATCCTGGAA TGTGGTAGGT GCTCAGTAAA 960 
TGTGTGTTGA ATAAATGAAT GAATGAATGA ACAAATGAAT GAATTTGCTT ACTTCAAGGC 1020 
AAAAGAACCA TGAAACTGTA TTTTGAGTTT CTATGTTATA GCAGTCAGCA AATCCTATTA 1080 
AATACTTTGT GTTTCCAAGC AAAAAAAAAA AAAAAAAAAA AAACTCGA 1128 

(2) INFORMATION FOR SEQ ID NO: 183: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2276 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183: 

CCGCGGCGTC TGACCTCATG GCGTAGAGCC TAGCAACAGC GCAGGCTCCC AGCCGAGTCC 60 

GTTATGGCCG CTGCCGTCCC GAAGAGGATG AGGGGGCCAG CACAAGCGAA ACTGCTGCCC 120 

GGGTCGGCCA TCCAAGCCCT TGTGGGGTTG GCGCGGCCGC TGGTCTTGGC GCTCCTGCTT 180 

GTGTCCGCCG CTCTATCCAG TGTTGTATCA CGGACTGATT CACCGAGCCC AACCGTACTC 240 

AACTCACATA TTTCTACCCC AAATGTGAAT GCTTTAACAC ATGAAAACCA AACCAAACCT 300 

TCTATTTCCC AAATCAGCAC CACCCTCCCT CCCACGACGA GTACCAAGAA AAGTGGAGGA 360 



GCATCTGTGG TCCCTCATGC CTCGCCTACT CCTCTGTCTC AAGAGGAAGC TGATAACAAT 
GAAGATCCTA GTATAGAGGA GGAGGATCTT CTCATGCTGA ACAGTTCTCC ATCCACAGCC 



TTTTTATCTG TACTTTTAGA GCTGAGTTTA ATCAGGTGTC CAAAATGTGA GTTAAACATT 



420 
480 



AAAGACACTC TAGACAATGG CGATTATGGA GAACCAGACT ATGACTGGAC CACGGGCCCC 540 



600 



AGGGACGACG ACGAGTCTGA TGACACCTTG GAAGAAAACA GGGGTTACAT GGAAATTGAA 
CAGTCAGTGA AATCTTTTAA GATGCCATCC TCAAATATAG AAGAGGAAGA CAGCCATTTC 660 
TTTTTTCATC TTATTATTTT TGCTTTTTGC ATTGCTGTTG TTTACATTAC ATATCACAAC 720 
AAAAGGAAGA TTTTTCTTCT GGTTCAAAGC AGGAAATGGC GTGATGGCCT TTGTTCCAAA 780 
20 ACAGTGGAAT ACCATCGCCT AGATCAGAAT GTTAATGAGG CAATGCCTTC TTTGAAGATT 
ACCAATGATT ATATTTTTTA AAGCACTGTG ATTTGAATTT GCTTATGTAA TTTTATTTGC 
TTGACTTTTT ATATGATATT GTGCAAATGT TTGCCATAGG CAATTGGTAC TTAAATGAGA 
GGTGAGTCTC TCTTTTGCCT TGGTGCTTTG GAAATTAAAT GTCACAAACG AGTATATAAT 1020 



840 
900 
960 



1080 



30 ACCTTATATT TACACTGTTA GTTTTTATTG TTTTAGATTT ATTATGCTTC TTCTGGAAGT 1140 



ATTAGTGATG CTACTTTTAA AAGATCCCAA ACTTGTAACT AAATTCTGAC ATATCTGTTA 1200 

CTGCTGACTC ACATTCATTC TCCGCCATTC AAATACTATT TTTTATCCAC ATTTTTTTTT 1260 

GTTCCCAAAC TGTAATGTAC AAGGATATGT GTGATAATGC TTTGGATTTG AGTAATATTT 1320 

TTTTTTCTTC CAAGAAAACT GCTTTGGATA TTTTTAGATA ATTTAAACAT AATTTAGGAT 1380 

40 AATGATATTG CTCAATCTGA CCACAATTTT AGGTAAAACA TTAAATGTGT CAGAAATCTT 1440 

GGCAACAGAG ACTCTGCAGC TTGCAGTGGA CATAGATAAA ATGTTACAGA GATACTATTT 1500 

TTTTGGTTGG AATTACTATA TTAAATTTAG AAGCAGAAAC TGGTAAAATG TTAAATACAT 1560 

GTACAATTGC TTTTAGTTAG CAATTGATTG TAGCATGGGT TCCTCCAAGG TTTCAAGCAA 1620 

TGGGCAGAGT TTAAAA1TAT ATCAGATTCG TTTACTTCGT TTATTATTTT ACAGTAAATT 1680 
TGAATAAATC TTAGGGGTCA TTATCACTTA AATAATACTG TACCTAGGTC TTTCAAATTA 1740 
AAATTATACC TGAATGAAGT TGTTTGTATA CATAAAGGAT ATTTGTGTAC AATTACCTTT 1800 

nTCCCCCAC ACTIGTT*rrC TT T GTTT T TG TTTTTTATGG CAACTGGAAA GTATTTACTA 1860 

TGGGATTCAT TTATGTCTGT CTTTCTATCA TAAAGAATTG ATCAATATGT AAATATGTGA 1920 
TTTGAACCAT GGTTGACTTA CAAGTGTCAC TACAGCTTTT TAGAAAACAT AGCCCTAATA 1980 
TATGTTAAGC AGGACCCGGG TGAGCCAGTG GGCTTGCGCT TTATGTAGAG CTGGAAGAAG 2040 
GCCGTCCATC CTGTCTCTTG GGCGGACAGT GTACTTTCCT AATAI}G<JAAG GGAAGCACAA 
TGGAAATACC CCTGAACCGT TTTATTGCAG TAATTTTTTT CATATCTGAA ACTATTATTT 
AATATTTTGA ATAAGATTTT AAAAAATAAA TGGCAAAGAT ATAAATGTAA AAAAAAAAAA 
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAA 

(2) INFORMATION FOR SEQ ID NO: 184: 



TCTTGCTAGA ATGAAAATTC CTGAGACCCT TGAAGAAGAT CAGCAATTCA TGCTAAAAAA 
GTGTCCTGCC CTACTTCAAG AAATGGTTAA TGTAATCTGC CAACTAATAG TAATGGCCCG 



21CC 
2160 
2220 
22^6 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2500 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184: 
TCCAAGCTAC GCCACTCGGG CTGGGGCGTT GGGAGCGGGA GTGCAGAGCG TGGTCGTGGC 60 
GGCGGCGGTG AGAAGAGCGA GGCGKAGGAG GGGGTGCCAT GGCCGGGCAG CAGTTCCAGT 120 
ACGATGACAG TGGGAACACC TTCTTCTACT TCCTCACCTC CTTCGTGGGG CTCATCGTGA 180 
TCCCGGCGAC ATACTACCTC TGGCCCCGAG ATCAGAATGC CGAGCAAATT CGATTAAAGA 240 
ATATCAGAAA AGTATATGGA AGGTGTATGT GGTACGTTTA CGGTTATTAA AACCCCAGCC 300 
AAATATTATT CCTACAGTAA AGAAAATAGT TCTGCTTGCA GGATGGGCAT TGTTCTTATT 360 
CCTTGCATAT AAAGTTTCCA AAACAGACCG AGAATACCAA GAATACAATC CTTATGAAGT 420 
ATTAAATTTG GATCCTGGAG CCACAGTAGC AGAAATTAAA AAACAATATC GTTTGCTGTC 480 
ACTTAAATAT CATCCAGATA AAGGAGGTGA TGAGGTTATG TTCATGAGGA TAGCAAAAGC 540 



600 



TTATGCTGCT TTAACGGATG AAGAGTCCCG GAAAAATTGG GAAGAATTTG GAAATCCAGA 
TO3GCCTCAA GCCACAAGCT TTGGAATTGC CCTGCCAGCT TGGATAGTTG ACCAGAAAAA 660 
TTCAATTCTG GTTTTACTTG TATATGGATT GGCATTTATG GTTATCCTTC CAGTTGTTGT 
C^TCTTGG TGGTATCGCT CAATACGCTA TAGTGGAGAC CAGATTCTAA TACGSACAAC 
ACAGATTTAT ACATACTTTG TTTATAAAAC CCGAAATATG GATATGAAAC GTCTTATCAT 840 
GCTTTTGGST C^GCTTCTG AATTTGATCC TCAGTATAAT AAAGATGCCA CAAGCAGACC 
AACGGATAAT ATTCTAATAC CACAGCTAAT CAGAGAAATT GGCAGCATTA ATTTAAAGAA 
GAATGAGCCT CCACTTACCT GCCCATATAG CCTGAAGGCC AGAGTTCTTT TACTGTCTCA 1020 



720 
780 



900 
960 



1080 
1140 
GAACCGT3AA GAAAGGGAGT TTCGTGCTCC AACTTTGGCA TCCCTAGAAA ACTGCA7GAA 


1200 




GCTTTCTCAG ATQ3CCGTTC AGGGACTTCA GCAATTTAAG TCTCCCC7TC TGCAGCTCCC 


1260 




TCATATTGAA GAGGACAATC TTAGAGGGGT TTCTAATCAT AAGAAGTATA AAATTAAAAC 


1320 




TATCCAGGAT TTOGTGAGTT TAAAAGAATC AGATCGTCAC ACTCTACTGC ACTTCCTTGA 


1380 




AGATGAAAAA TATGAAGAGG TTATGGCTGT CCTTGGGAG7 TTTCCATATG TGACCATGGA 
TATAAAATCA CAGGTGTTAG ATGA7GAAGA TAGCAACAAC ATCACAGTAG GATCCTTAGT 


1440 
1500 




TACAGTGTTG GTTAAGTTGA CAAGGCAAAC AATGGCTGAA GTATTTGAAA AGGAGCAGTC 


1560 




CATCTGTGCT GCAGAGGAAC AGCCAGCAGA AGATGGGCAG GGTGAAACTA ACAAGAACAG 


1620 




GACAAAAGGA GGATGGCAAC AGAAGAGTAA AGGACCCAAG AAAACTGCTA AATCAAAAAA 


1680 




AAAGAAACCT TTAAAAAAAA AACCTACACC TGTGCTATTA CCACAGTCAA AGCAACAGAA 
ACAAAAGCAG GCAAATGGAG TCGTTGGGAA TGAAGCTGCA GTAAAGGAAG ATGAAGAAGA 


1740 
1800 




AGTTTCAGAT AAGGGCAGTG ATTCTGAAGA AGAAGAAACC AATAGAGATT CCCAAAGTGA 


1860 




GAAAGATGAT GGTAGTGACA GAGACTCTGA TAGAGAGCAA GATGAAAAAC AAAACAAAGA 


1920 




TGATGAAGCA GAGTGGCAAG AATTACAACA AAGCATACAG CGAAAAGAGA GAGCTCTATT 


1980 




GGAAACCAAA TCAAAAATAA CACATCCTGT GTATAGCCTT TACTTTCCTG AGGAAAAACA 
AGAATGGTGG TGGCTTTACA TTGCAGATAG GAAGGAGCAG ACATTAATAT CCATGCCATA 


2040 
2100 




TCATGTGTGT ACGCTGAAAG ATACAGAGGA GGTAGAGCTG AAGTTTCCTG CACCAGGCAA 


2160 




GCCTGGAAAT TATCAGTATA CTGTGTTTCT GAGATCAGAC TCCTATATGG GTTTGGATCA 


2220 




GATTAAACCA TTGGAAGTTK GGAAGI lv_Ai ualaj^ iumu ll i\jivj^-A_rtvj 


2280 




ACAGTGGGAT ACAGCAATAG AGGGGGATGA AGACCAGGAG GACAGTGAGG GCTTTGAAGA 
TAGCTTTGAG GGAGGAAGAG GGAGGGAGGA AGGAAGGTGG TGGACTTAAG GCAGTTACTC 


2340 
2400 




TGGAATGGGA CCCACAGTGT TTTGCACCAT ATTTTGGCAA TTTTTTTTGC CCGTTTTTNG 


2460 




GAAGTGTTTT CCT7TNAANCC CAGGAACCAT TACAGAACCG 


2500 




(2) INFORMATION FOR SEQ ID NO: 185: 






(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1337 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185: 






CTTCCGGTTC TCCGGGCAGC TGCC^CTGCT GTAGCTTCTG CCAGCTGCCA CGACCGGGCC 60 

(2) INFORMATION FOR SEQ ID NO: 186: 



(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 941 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186: 

GGCACGAGCC TGGACGCAGC AGCCACCGCC GCGTCCCTCT CTCCACGAGG CTGCCGGCTT 

60 



120 
180 
240 
300 
360 



TCTCCCTGGC GTTTGGTCAC CTCTGCTTCA TTCTCCACCG C3CCTATGGT CCC'rCTTGGA 
GCCAGCOTGG CGGGCCTGGC GGCTCCCGGG TGGTGAGAGA GCGGTCC'ZGG AACGATGAAG 

GCCTCGCAGT GCTGCTGCTG TCTCASCZAC CTCTTGGCTT CCGTCCTCCT OCTGCTGTTG 
CTGCCTGAAC TAAGCGGGYC CCTGGMAGTC CTGCTGCAGG CAGCCGAGGC CGCGCCAGGT 
CTTCGGCCTC CTGACCCTAG ACCACGGACA TTACCGCCGC TGCCACCGGG CCCTACCCCT 

GCCCA3CAGC CGGGCCGTOG TCTGGCTGAA GCTGCGGGGC CGCGGGGCTC CGAGGGAGGC 420 

AATGGCAGCA ACCCTGTGGC CGGGCTTGAG ACGGACGATC ACGGAGGGAA GGCCGGGGAA 480 

GGCTCGGTGG GTGGCGGCCT TGCTGTGAGC CCCAACCCTG GCGACAAGCC CATGACCCAG 540 

CGGGCCCTGA CCGTGTTGAT GGTGGTGAGC GGCGCGGTGC TGGTGTACTT CGTGGTCAGG 600 

ACGGTCAGGA TGAGAAGAAG AAACCGAAAG ACTAGGAGAT ATGGAGTTTT GGACACTAAC 660 

ATAGAAAATA TGGAATTGAC ACCTTTAGAA CAGGATGATG AGGATGATGA CAACACGTTG 720 

TTTGATGCCA ATCATCCTCG AAGATAAGAA TGTGCCTTTT GATGAAAGAA CTTTATCTTT 780 

CTACAATGAA GAGTGGAATT TCTATGTTTA AGGAATAAGA AGCCACTATA TCAATGTTGG 840 

GGGGGTATTT AAGTTACATA TATTTTAACA ACCTTTAATT TGCTGTTGCA ATAAATACCG 900 

TATCCTTTTA TTATATCTTT ATATGTATAG AAGTACTCTR TTAATGGGCT CAGAGATGTT 960 

GGGGATAAAG TATACTGTAA TAATTTATCT GTTTGAAAAT TACTATAAAA CGGTGTTTTC 1020 

TGATCGGTTT TTGTTTCCTG CTTACCATAT GATTGTAAAT TGTTTTATGT ATTAATCAGT 1080 

TAATGCTAAT TATTTTTGCT GATGTCATAT GTTAAAGAGC TATAAATTCC AACAACCAAC 1140 

TGGTGTGTAA AAATAATTTA AAATTTCCTT TACTGAAAGG TATTTCCCAT TTTTGTGGGG 1200 

AAAAGAAGCC AAATTTATTA CTTTGTGTTG GGGTTTTTAA AATATTAAGA AATGTCTAAG 1260 

TTATTGTTTG CAAAACAATA AATATGATTT TAAATTCTCT TAAAAAAAAA AAAAAAAACC 1320 

CCGGGGGGGG GCCCGGN 1337 

AGGACCCCCA GCTCCGACAT GTCGCCCTCT 3G7CGCCTOT GTCTTCTCAC CATCGTTGGC 120 

CTGATTCTCC CGAGGAGAGG AGAGAGGTTG AAAGATACCA CGTCCAGTTC TTCAGCAGAC 180 

TCAACTATCA TGGACATTCA GGTCCCGACA CGAGCCCCAG ATGCAGTCTA CACAGAACTC 240 

CAGCCCACCT CTCCAACCCC AACCTGGCCT GCTGATGAAA CACCACAACC CCAGACCCAG 300 

ACCCAGCAAC TGGAAGGAAC GGATGGGCCT CTAGTGACAG ATCCAGAGAC ACACAAGAGC 360 

ACCAAAGCAG CTCATCCCAC TGATGACACC ACGACGCTCT CTGAGAGACC ATCCCCAAGC 420 

ACAGACGTCC AGACAGACCC CCAGACCCTC AAGCCATCTG GTTTTCATGA GGATGACCCC 480 

TTCTTCTATG ATGAACACAC CCTCCGGAAA CGGGGGCTGT TGGTCGCAGC TGTGCTGTTC 540 

ATCACAGGCA TCATCATCCT CACCAGTGGC AAGTGCAGGC AGCTGTCCCG GTTATGCCGG 600 

AATCATTGCA GGTGAGTCCA TCAGAAACAG GAGCTGACAA CCTGCTGGGC ACCCGAAGAC 660 

CAAGCCCCCT GCCAGCTCAC CGTGCCCAGC CTCCTGCATC CCCTCGAAGA GCCTGGCCAG 720 
AGAGGGAAGA CACAGATGAT GAAGCTGGAG CCAGGGCTGC CGGTCCGAGT CTCCTACCTC 
CCCCAACCCT GCCCGCCCCT GAAGGCTACC TGGCGCCTTG GGGGCTGTCC CTCAAGTTAT 
CTCCTCTGYT AAGACAAAAA GTAAAGCACT GTGGTCTTTG CAAAAAAAAA AAAAAAAAAA 



(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 654 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 
40 CD) TOPOLOGY : linear 



780 
840 
900 



AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAACTCG A 941 



(2) INFORMATION FOR SEQ ID NO : 187: 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 187: 

GAATTCGGCA CGAGGCAGCT TGTGCTTTAA AGGAGGTGTT CAAAGCATGT CTGAGCAGAG 60 

ACITTTGGGC TCTGTTTTAA TTAATACTTT AAAATAATTC ATATTTAAAA TATCARATGT 120 

TTCCATAAAG AGGAGGATGT TTAAATGCCT CCAGACTACA TTCCTTTTTA TTSCTTGATT 180 

TTACCTGGGA GTCCAAAGTT CAATTCCCAT AAAGCAAGCG TTTTATTTGT CACTTTCAAT 240

ATACATCCGA TTGCCATGCT TAAGATGCAA TATGGGCTGC GGAAATAGGT TAACCCACAG 300 

GCTCCCAGGG CCCAGTGTAG AAGGTGAGAG ATTCGTGTAA AATGATTCAA ATAAAAGGAA 360 

GACCCTGGCC GGGTGCCGTA RCTCACGCCT GTAATCCCAG CACTTTGGGA GGCCGAAGCG 420 
AGTGGATGAC GAGGTTAGGA GTTGGAGACC AGCCTGGCCA ACATCGTGAA ACCCCGTCTC 480

TACTAAAAAT ACAAAAATTA GCCGGGCATG GTGGCAGGCA CCTGTAATCC TAGCTAGTTG 540

GGAGGCTGAG GCAGGAGAAT CGTTTGAATC TGGGAGTTGG AGGTTGTCAG TGAGCTGAGA 600

TCGCGCCACA GCACTCCAGC CTGGGTGACA GGGTGAGACT CT3TCTCAAA NAGA 654



(2) INFORMATION FOR SEQ ID NO: 188:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1848 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 188:
GAAACTGGAC CGGAGAACCG GAGCGAAGCG AAGCGGAAGC CC-3GAATGAG GCCGGACTGG 60

AAAGCCGGAG CGGGGCCAGG CGGGCCTCCC CAAAAGCCTG CCZCTTCATC CCAGCGGAAA 120

CCGCCGGCCC GGCCGAGCGC GGCGGCCGCT GCGATTGCAG TCGCGGCGGC GGAGGAAGAG 180

AGACGGCTCC GGCAGCGGAA CCGCCTGAGG CTGGAGGAGG ACAAACCGGC CGTGGAGCGG 240

TGCTTGGAGG AGCTGGTCTT CGGCGACGTC GAGAACGACG AGGACGCGTT GCTGCGGCGT 300

CTGCGAGGCC C3AGGGTTCA AGAACATGAA GACTCGCGTG ACTCAGAAGT GGAGAATGAA 360

GCAAAAGGTA ATTTTCCACC TCAAAAGAAG CCAGTTTGGG TGGATGAAGA AGATGAAGAT 420

GAGGAAATGG TTGACATGAT GAACAATCGG TTTCGGAAGG ATATGATGAA AAATGCTAGT 480

GAAAGTAAAC TTTCGAAAGA CAACCTTAAA AAGAGACTTA AAGAAGAATT CCAACATGCC 540

ATGGGAGGAG TACCTGCCTG GGCAGAGACT ACTAAGCGGA AAACATCTTC AGATGATGAA 600

AGTGAAGAGG ATGAAGATGA TTTGTTGCAA AGGACTGGGA ATTTCATATC CACATCAACT 660

TCTCTTCCAA GAGGCATCTT GAAGATGAAG AACTGCCAGC ATGCGAATGC TGAACGTCCT 720

ACTGTTGCTC GGATCTCCAT CTGTGCAGTT CCATCCCGGT GCACAGATTG TGATGGTTGC 780

TGGGATTAGA TAATGCTGTA TCACTATTTC AGGTTGATGG GAAAACAAAT CCTAAAATTC 840

AGAGCATCTA TTTGGAAAGG TTTCCAATCT TTAAGGCTTG TTTTAGTGCT AATGGGGAAG 900

AAGTTTTAGC CACGAGTACC CACAGCAAGG TTCTTTATGT CTATGACATG CTGGCTGGAA 960

AGTTAATTCC TGTGCATCAA GTGAGAGGTT TGAAAGAGAA GATAGTGAGG AGCTTTGAAG 1020

TCTCCCCAGA TGGGTCCTTC TTGCTCATAA ATGGCATTGC TGGATATTTG CATTTGCTAG 1080

CAATGAAGAC CAAAGAACTG ATTGGAAGCA TGAAAATTAA TGGAAGGGTT GCAGCATCCA 1140

CATTCTCTTC AGATAGTAAG AAAGTATACG CCTCTTCGGG G3ATGGAGAA GTTTATGTTT 1200

GGGATGTGAA CTCAAGGAAG TGCCTTAACA GATTTGTTGA T3AAGGCAGT TTATATGGAT 1260

TAAGCATTGC CACATCTAGG AATGGACAGT ATGTTGCTTG TGGTTCTAAT TGTGGA-3TGG 1320

TAAATATATA CAATCAAGAT TCTTGTCTCC AAGAAACAAA CCCAAAGCCA ATAAAAGCTA 1380

TAATGAACTT GGTTACAGGT GTTACTTCTC TGACCTTCAA TCCTACTACA GAAATCTTGG 1440

CAATTGCTTC AGAAAAAATG AAAGAAGCAG TCAGATTGGT TCATCTTCCT TCCTGTACAG 1500

TATT7TCAAA CTTCCCAGTC ATTAAAAATA AGAATATTTC TCATGTTCAT ACCATGGATT 1560

TTTCTCCGAG AAGTGGATAC TTTGCCTTGG GGAATGAAAA GGGCAAGGCC CTGATGTATA 1620

GGTTGCACCA TTACTCAGAC TTCTAAAGAG ACTATTTGAA GTCCAGTTGA GTCACAAGAG 1680

AAGCCTGTCT TGATATATCA TCTCAGAAAC TTTCCTGAAT ATGTGATAAT ATATGGAAAA 1740

TGATTTATAG ATCCAGCTGT GCTTAAGAGC CAGTAATGTC TTAATAAACA TGTGGCAGCT 1800

TTTGTTTGAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAACTCGA 1848



(2) INFORMATION FOR SEQ ID NO: 189:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1146 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 189: 

AAAAAAAACC CAGGGGAACN TTGGGGGCCG CTTTONNTTC CCCCTCCAGG CCATTGGGGA 60

ATTCTTCAAG TTAATCCTGC TTTGCTCTTG GCCAACAGGG CTTGTAGGGG GGAGAGACCC 120

AGGATCATCA AGGGGTTCGA GTGCAAGCCT CACTCCCAGC CCTGGCAGGC AGCCCTGTTC 180

GAGAAGACGC GGCTACTCTG TGGGGCGACG CTCATCGCCC CCAGATGGCT CCTGACAGCA 240

GCCCACTGCC TCAAGCCCCG CTACATAGTT CACCTGGGGC AGCACAACCT CCAGAAGGAG 300

GAGGGCTGTG AGCAGACCCG GACAGCCACT GAGTCCTTCC CCCACCCCGG CTTCAACAAC 360

AGCCTCCCCA ACAAAGACCA CCGCAATGAC ATCATGCTGG TGAAGATGGC ATCGCCAGTC 420

TCCATCACCT OSXTGTGCG ACCCCTCACC CTCTCCTCAC GCTGTGTCAC TGCTGGCACC 480

AGCTGYCTCA TTTCCGGCTG GGGCAGMACG TCCAGCCCCC AGTTACGCCT GCCTCACACC 540

TTGSGATGCG CCAACATCAC CATCATTGAG CACCAGAAGT GTGAGAACGC CTACCCCGGC 600

AACATCACAG ACACCATGGT GTGTGCCAGC GTGCAGGAAG GGGGCAAGGA CTCCTGCCAG 660

GGTGACTCCG 03GGCCCTCT GGTCTGTAAC CAGTCTCTTC AAGGCATTAT CTCCTTGGGGC 720

CAGGATCCGT GTGCGATCAC CCGAAAGCCT GGTGTCTACA CGAAAGTCTG CAAATATGTG 780

GACTGGATCC AGGAGACGAT GAAGAACAAT TAGACTGGAC CCACCCACCA CAGCCCATCA 840

CCCTCCATTT CCACTTGGTG TTTGGTTCCT GTTCACTCTG TTAATAAGAA ACCCTAAGCC 900

AAGACCCTCT ACGAACATTC TTTGGGCCTC CTGGACTACA GGAGATGCTG TCACTTAATA 960

ATCAACCTGG GGTTCGAAAT CAGTGAGACC TGGATTCAAA TTCTGCCTTG AAATATTGTC 1020

ACTCTGGGAA TGACAACACC TGGTTTGTTC TCTGTTGTAT CCCCAGCCCC AAAGACAGCT 1080

CCTGGCCATA TATCAAGGTT TCAATAAATA TTTGCTAAAT GAAAAARAAA AAAAAAAAAA 1140

ACTCGA 1146

(2) INFORMATION FOR SEQ ID NO: 190:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 906 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 190:

ACTCCCTCAC CCAGGTCCCA GCCCTGGGAA CCACCTACCG TGAGCCCTTT TGCAGATATA 60

GACTCATTTC ATCCTCAGAT GGTCCTTCAA GGTAGGTACT TTAGTCCCAT TTTAGAGATG 120

AGACGATTGA GGCCAGAGGG GTGNNGTAAC TTGCCTGGGG GCTCACGAGC ACAAAAGGAG 180

CCGAGGCAGG ATCTGACCCT TGTTCTCTGG CCTCACTGCC CTCACTTTGC CATGACCCGA 240

AGTTATGTCC CTACAAAGCA ATGCATGGTC CAAGGYTCTT TTTATTGTAT TTTTATTTTT 300

AAGGGTCCTG TTCAAAACTG GTGTGAGCTC TGAGGAGTCC TGAACCCTGG GTGCAGCATC 360

CTAGCATCCT GGGAGTCCTT TTCTGCCCAC ACTGAGCTGG GCTCCTCGAG GGGTGGGGCT 420

GCTGTCCCTG GAAGCCTGGC AGCAGCACTG TATCGGGTTG GCTGAAGCTG ARCGCCGTGG 480

GGTGCAGGGC TCCMGGAATC CCCGTTTGGC TGAAGGGGTT CCCTGTAQZC MGGGATGTTT 540

ATGAGGTCTC TCTGATGCCC CAGGCGCAGG ACATGTGTGC GGGTGGAGAA AAGCAGGCCC 600

TTTCAGTGCC AGCTCCACTC AATTTCTATG TGGACCAAGA ACGATAAACT TAAAAAATTT 660

TTTTTCCTAA GGTATCTTCA GAATATGGTG TATTTTTATG TGGAAAAGAA AAGTTATGAA 720

GGCAGCTGTT ACTTTAAGAG AAAATTCATT AAAAGTCCTC GAGGTATGAA GATGACGGCG 780

TGCTTCTCAA TCATTTTGGC ATAACTTGAT TGTGGCTGTA ATTTTTTTTT TTTTTTTTGT 840

CAAGCATGTC AGACAATAAA GTCTTTGTAA AAAGRGAAAA AAAAAAAAAA AAAAAAAAAA 900

ACTCGA 906

(2) INFORMATION FOR SEQ ID NO: 191:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1941 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 191:

CTTCAGCTGA AGCCCAGGGA CCCCTTTTCC ACCCTGGGCC CCAATGCCGT CCTTTCCCCG 60

CAGAGACTGG TCTTGGAAAC CCTCAGCAAA CTCAGCATCC AGGACAACAA TGTGGACCTG 120

ATTCTGGCCA CACCCCCCTT CAGCCGCCTG GAGAAGTTGT ATAGCACTAT GGTGCGCTTC 180

CTCAGTGACC GAAAGAACCC GGTGTGCCGG AGATGGCTGT GGTACTGCTG GCCAACCTGG 240

CTCAGGGGGA CAGCCTGGCA GCTCGTGCCA TTGCAGTGCA GAAGGGCAGT ATCGGCAACC 300

TCCTGGGCTT CCTAGAGGAC AGCCTTGCCG CCACACAGTT CCAGCAGAGC CAGGCCAGCC 360

TCCTCCACAT GCAGAACCCA CCCTTTGAGC CAAYTAGTGT GGACATGATG CGGCGGGCTG 420

CCCGCGCGCT GCTTGCCTTG GCCAAGGTGG ACGAGAACCA CTCAGAGTTT ACTCTGTACG 480

AATCACGGCT GTTGGACATC TCGGTATCAC CGTTGATGAA CTCAKTGGTT TCACAAGTCA 540

TTTGTGATGT ACTGTTTTTG NATTGGCCAG TCATGACAGC CGTGGGACAC CTCCCCCCCC 600

CGTGTGTGTG TGCGTGTGTG GAGAACTTAG AAACTGACTG TTGCCCTTTA TTTATGCAAA 660

ACCACCTCAG AATCCAGTTT ACCCTGTGCT GTCCAGCTTC TCCCTTGGGA AAAAGTCTCT 720

CCTGTTTCTC TCTCCTCCTT CCACCTCCCC TCCCTCCATC ACCTCACGCC TTTCTGTTCC 780

TTGTCCTCAC CTTACTCCCC TCAGGACCCT ACCCCACCCT CTTTGAAAAG ACAAAGCTCT 840

GCCTACATAG AAGACTTTTT TTATTTTAAC CAAAGTTACT GTTGTTTACA GTGAGTTTGG 900

GGAAAAAAAA TAAAATAAAA ATGGCTTTCC CAGTCCTTGC ATCAACGGGA TGCCACATTT 960

CATAACTGTT TTTAATGGTA AAAAAAAAAA AAAAAAATAC AAAAAAAAAT TCTGAAGGAC 1020

AAAAAAGGTG ACTGCTGAAC TGTGTGTGGT TTATTGTTGT ACATTCACAA TCTTGCAGGA 1080

GCCAAGAAGT TCGCAGTTGT GAACAGACCC TGTTCACTGG AGAGGCCTGT GCAGTAGAGT 1140

GTAGACCCTT TCATGTACTG TACTGTACAC CTGATACTGT AAACATACTG TAATAATAAT 1200

GTCTCACATG GAAACAGAAA ACGCTGGGTC AGCAGCAAGC TGTAGTTTTT AAAAATGTTT 1260

TTAGTTAAAC GTTGAGGAGA AAAAAAAAAA AGGCTTTTCC CCCAAAGTAT CATGTGTGAA 1320

CCTACAACAC CCTGACCTCT TTCTCTCCTC CTTGATTGTA TGAATAACCC TGAGATCACC 1380

TCTTAGAACT GGTTTTAACC TTTAGCTGCA GCGNCTACGT CNAWCGNTGT GTATATATAT 1440

GACGTKGTAC ATTGCACATA CCCTTGGATC CCCACAGTTK GGTCCTCCTC CCAGCTACCC 1500

CTTTATAGTA TGACGAGTTA ACAAGTTGGT GACCTGCACA AAGCGAGACA CAGCTATTTA 1560

ATCTCTTGCC CAGATATCGC CCCTCTTGGT GCGATGCTGT ACAGGTCTCT GTAAAAAGTC 1620

CTTCCTGTCT CAGCAGCCAA TCAACTTATA GTTTATTTTT TTCTGGGTTT TTC^TTGTT 1680

rrGTTITCrr TCTAATCGAG GTGTCAAAAA GTTCTAGGTT CAGTTGAAGT TCTGATGAAG 1740

AAACACAATT GAGATTTTTT CAGTCATAAA ATCTGCATAT TTGTATTTCA ACAATGTAGC 1800

TAAAACTTGA TGTAAATTGC TGCTTTTTTT GCTTTTTTGG CTTAATGAAT ATCATTTATT 1860

CAGTATGAAA TGTTTATACT ATATGTTCCA CGTGTTAAGA ATAAATGTAG ATTAAATCTT 1920

GGTAAGAGTT TAAAAAAAAA A 1941



(2) INFORMATION FOR SEQ ID NO: 192:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2118 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



GAAGTGGAAA GCTGTTTGCA AATAGCAACT CTGGCTAAAG CGAAAATGTT AATCAAGTAG 
AAAGTAAAAT TCAGGATCTT AGAAGCTCAT CCTTCTGATG AGAACTATTT TTTTTTCCGT 
GAAGGAACTA TTATTACTTT AAAAGTGAGG GTAATTTACA TATGGGGTGT ATATATTCTA 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 192: 

AAATAATAAT AANAATAAAT AAAAATWAAG TGCTTAKTGT AACTCAGCGG ACAGGGCTCC 60 

CAGCTGCTCT GGCACGTGGG ACACCYTCCA CCCTGCACAC AACAGGCATG CAAAGAGGAC 120 

TGGATATGGT GGGGTAGAGT GCTTCTGGTG TGTTCACTTT AAGAAAACAT CTGCCAAGAG 180

AGAAGAGTGC CCAGGAAAGA CCAGGAAAAT ACAAGTACAT G<5CTGCTTCA TACCATATAC 240 

CCCAATTCTT TAAAGCAGCA AAAGGCACTT TTTTTTTCAG GCCAGAGTGA ATCTAAAACA 300

AACCTGGCTT TGCTTACAGG GAAGCTGTCC CAGAAGGACT GAGTGATGCC TCTTGTTCCC 360 
TAAGGTCTGG AGAGTCTTTG CAAGTTTCCA ACGACATTTC CAACCAGGTG GGAGAGACCA 420

GCAGTTGACG AGACAAGTCA GACCCAAAAA ACGACGCCAA GGTAGTGAGT GGGTGCCTAT 480

TTGGGAGTAG GATGATTTGA GGAAAACAGG AAGAAAAACC GGTCAGAAAG TGGCACTTTG 540

GAGGAAAAAA AGGCAGGCAA AACTAGTCTA GGTCTAGGCC CTAAAAATGA GCTTCCTTCC
CACTTGACTG GAAACGCCCA TGTGATTTCT AGGCTGAAAA TAGGTAGGAT TTAACGAGTA



600 
660 
720 



AAAATAGTAA TAAAAGTACC TTTTATAAGC AATGTTGTGT C^CTTGTAGA AGAAAGCAGG 780
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ACCTAGTTCC CTTCTGTCTC 
CGATCATGCT CCCAGACGAG 
CAGCAGCAGC CCCCTCCTTC 
CATCCCATGT TCCAGTTCAG 
GCCCARGCCC AGCAGGTTGC 
CCAGGGAGGT GGCAGAGAGC 
CAGGCCCCCA GGTCCCCTCT 
AAANACAAAC GGGCAGGGCA 
ATAAAGGAAA TTCCACCCCT 
AAGAGGAAGG TCTTCTCTGG 
TCCAGGGCAC TTTTCACCAC 
ACTGCCAATT TCACAAAGCG 
TCAGGCCGCA GAAGTCCCCG 
TGATGCCAGC TCAAAGTCTT 
AGCGGCCTCC CGAGCAAGTT 
ATCAGTGTCT CATAGGGCAA 
ATGTCACAGT CACAGTCCAG 
TAAATCGGCA TCAGTGGGTG 
AAGTCAGACA GGACATGCCC 
ACCTCAGCAC TCTGCATGAA 
AGATAGTTTA ATATATGC 



TGATTTCTGA TCAGCTGATG 
TCCTTTGGCC TCTTGCTCTC 
TGTGTCCATC TGATGCAGGC 
CTTCTATGGG GTGACTARGA 
AAAAGCAGCT GCAAGCTTCA 
CCATCCAAAA GCCCACTGGG 
GTGTCAGGTA GGCTCTGCTA 
GGGTGGCAGG AATAAAAAAC 
CCCAATCCTT CCATGGAAGG 
CTTTCAGGGA AACAGCTGCA 
AGCCAGTGCA GCCGCTCCAA 
GTTGGTCCTT GGCTTGGTCA 
AANACCGCTG CCGCAGCACC 
TGAAAGTAGA GGCTGCCGTC 
CGGATGGGGG AAACTGAACA 
GTCCTGAGGG ATCTGGGACA 
GACTTCCTGC TCGCGATACA 
GCAGGCCAGG AAGAAGTCAT 
AAACCAGGTG ATGAGCCAGC 
GTCATGGAGC TCTGGATTCA 



GAGCTG<_ .AG 
CATCCCAAGC 
AAGCAGGAGC 
GGTTCCCGGT 
GAAACCCACT 
AGAGGC AT AA 
CTGGCCTCTG 
TCTGGACAGA 
GTGAGACCTT 
GCTGAAACTT 
GTGCCACTGT 
GGACATCTTT 
ATATCAGGCC 
CTCTCAGCTT 
AAAAGGTCTC 
ACAGGTGGTG 
ACACAATCAC 
ATAACCGCAC 
TGAGGGCAAA 
CCTGGTCAAT 



CTGACTCCTT 
AGTAAGAGGG 
AACTAGGGCA 
TCCTCCAACA 
GATTCTGTGC 
A^GTAAAGGC 
AAC 2 CTTTT A 
AATGTGATGT 
AGGGGCCCAT 
CAGCCCCATC 
TGTTCGATCT 
TCTGCTGGGC 
GCTGTTGGGC 
CTSTCTGCTG 
GACCGAGGCC 
GGCTGCAAAG 
GACGTGCCTG 
GATGGTCCCT 
GATGGGCATC 



96C 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2118 



(2) INFORMATION FOR SEQ ID NO: 193:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1538 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 193:

CCGGGTTCGG CTCTGTGTCA GCAGCCGGGC GGCGCTCGGG CGGGACATGG CAGCCTGTAC 60

AGCCCGGCGG CCTGGCCGTG GGCAGCCGCT GGTGGTCCCG GTCGCTGACT GNGGCCCGGT 120

GGCCAAGGCC GCTCTGTGCG CGGCCGNAGC TGGAGCGTTC TCGCCAGCGT CGACCACGAC 180

GACGCGGAGG CACCTCTCGT CCCGAAACCG ACCAGAGGGC AAAGTGTTGG AGACAGTTGG 240

TGTGTTTGAG GTGCCAAAAC AGAATGGAAA ATATGAGACC GGGCAGCTTT TCCTTCATAG 300

CATTTTTGGC TACCGAGGTG TCGTCCTGTT TCCCTGGCAG GCCAGACTGT RTGACGGGGA 360

TGTGGCTTCT GCAGCTCCAG AAAAAGCAGA GAACCCTGCT GGCGATGGCT CCAAGGAGGT 420

GAAAGGCAAA AGTCACAGTT ACTATCAGGT GCTGATTGAT GCTCGTGACT GCCCACATAT 480

ATCTCAGAGA TCTCAGACAG AAGCTGTGAC GTTCTTGGCT AACCATGATG ACAGTCGGGC 540

CCTCTATGCC ATCCCAGGCT TGGACTATGT CAGCCATGAA GACATCCTCC CCTAGACCTC 600

CACTGATCAG GTTCCCATCC AACATGAACT CTTTGAAAGA TTTCTTCTGT ATGACCAGAC 660

AAAAGCACCT CCTTTTGTGG CTCGGGAGAC GCTAAGGGCC TGGCAAGAGA AGAATCACCC 720

CTGGCTGGAG CTCTCCGATG TTCATCGGGA AACAACTGAG AACATACGTG TCACTGTCAT 780

CCCCTITCTAC ATGGGCATGA GGGAAGCCCA GAATTCCCAC GTGTACTGGT GGCGCTACTG 840

TATCCGTTTG GAGAACCTTG ACAGTGATGT GGTACAGCTC CGGGAGCGGC ACTGGAGGAT 900

ATTCAGTCTC TCTGGCACCT TGGAGACAGT GCGAGGCCGA GGGGTAGTGG GCAGGGAACC 960

AGTGTTATCC AAGGAGCAGC CTGCGTTCCA GTATAGCAGC CACGTCTCGC TGCAGGCTTC 1020

CAGTGGGCAC ATGTGGGGCA CGTTCCGCTT TGAAAGACCT GATGGCTCCC ACTTTGATGT 1080

TCGGATTCCT CCCTTCTCCC TGGAAAGCAA TAAAGATGAG AAGACACCAC CCTCAGGCCT 1140

TCACTGGTAG GCCAGCTGAG GCCCCAAGTG CCCAGGCTTG GTCACCGGGA AGAACAACTC 1200

TCATCCCACA ATTGCTGCAG AACTCTTCTC TCCCCATCAT GGGCCACAGT GGGTCTCTTA 1260

ATTTGATTGT GGGGTTCTTT TTGTGGGGAG GGGTGGTATA ACTTTTCTTC AGAAGACCCA 1320

TGTGGGACAC CTCCAAGGCT GGCCTCCTCA TAAGCCCTGC CTACACCATG TTCCAGTAAA 1380

CCTCTCCACC AAGGAACTGT GTTCAGCTGC CACAGGCCTG GAGGAGTTTC CTGGCCTGTC 1440

ACGTGAGGTT TGATCAGTAA ACCAGTGCAS GYTTGGCCAA AAAAAAAAAA AAAAAAAAAA 1500

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAACTCGA 1538

(2) INFORMATION FOR SEQ ID NO: 194:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1098 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 194:



AGACCCTGTC TCAAATAATA ATAATAATAA TAATCTTATT TTGGAGAVTA AAGA3ACC T S 60

TGGATTTGAG GTOZCATTTG GGTAGAAAGA AAAGACGTTT ACACC GAC-AA ATAGTCTGTG 120

TTGCCCTGAA GGAGCAGAGG GATGCATCGC TGGAGGTGAC CTACA3TTGA AGAAGACTCA 180

TTATGACAGA CCTTGTCC~T CTTCCTTGTG GAAAGTGTTT CCTC73CTGC TACTGCTCAT 240

GAGACTCTTC CCCCTCCCTG TCCCAGGGAA CCA^AGGGCT TTNCTACCAC ACCCTTT-™ 300

NGCCCCCCGC CTCCCATGTC TGCTGTGCCT TTGTACTCAG CAATTCTTNG TTTGCTCCCA 360

TTATCTTCCA GCCGGATACA GAGTGAATAG TTAACCACAC TTAGGTCAAA TAGGATCTAA 420

ATTTTTGTTC CTGCTCCNGT GTAAAGAGGC CAGTGTTTGT GTGTTGCAAG CAGCCTTGGA 480

ATAGTAACTC TTCTCATTTG TTTGGGATCT GGCCAMCAAG TTCCAGAATG ATACACGGAT 540

CAGTGCAGAA GTTCATCAGG CTCTCGGACC TTAGGGCTGT TGGAGAAGGC TTCAGCAGCA 600

GAACTGATGG TKAWKGYTCG TGTTCTCCAT CCTCAACTTT CTTTGCTTCG ATCATACACA 660

AGAATACATT TGGAAGGGCA AAAAATGAAC ACTGTTGTTC ATTGCAGCCG TGTTTTGTGA 720

CACAGATGCA CAGTCTGCTG TGAAGACCTT CTCTCAAGTG GSATYTGGGA GTCCATGCCA 780

GATCATGGTG CTTCATGAGA GACTGACAGC TATCAGGGGT TGTGGCACTT AGTGAGGACT 840

CTCCTCCCCC AGTGTGTGCT GATGACACAT ACACACCTGA CAATAGCTTG AGTCTTCTCT 900

GTTCCTTTTA CTCTGTAGCC AACATACACA TGATTTAAAA CCCTTTCTA^ ATATCTATCA 960

TGGTTCATCC TTGTCCAAAT GCAGAGTCAG AGCTATTTGT ACTTCATTAT TATTTCCAAG 1020

GCGAATAGTT GGCTTTCTTT TTGCAAAAAT AATTAAAGTT TTTGTATGTT GCAAAAAAAA 1080

AAAAAAAAAA CTACGTAG 1098

(2) INFORMATION FOR SEQ ID NO: 195:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1001 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 195:

GAATTCGGCA CGAGATAGCT TGCATCTCAT CCCAGTAAAA CCACTTATTT ATAACATATC 60

AACGTATTGA CAAGGTTGAA GAGCAAGATT GTTCTGAGGT GAGATGCAAA TTTCAAAGGG 120

GTGAGCACTA ATTGTTCCAG TGATTGTTTA TTTATTGGCT AGGACATAAT TACTCTCTTT 180



GAGGTTACAC ATCTGCCTCC AGGTTCCTGT GTGCTTGTGC CCTTGGGATC AGGCCAGGGC 240

AGACTGTGAT CACTGAGATT CAAACTCCCA GARTAATCAG GAAGAGCTTT CTAGAGACCA 300

AGGCCAGGCC TGATCCCTGA GGGATGCATG AGAAGGCTTG GAATCTCATT CTGCTATGGT 360

GGCTCTCTCT TGATCTTCTT GGAGTAGCAA AAACAGCAAT GTGGGCCCAA TGGTGTGGCC 420

TAAATGATCA CAAAGGTAAA TGAG7AAAGG GCTCAGCAGA TGAGTAAGGA GCCTTGTCCT 480

GAGAAATTAG CACTGGGCTC TGCATTCAGA AACATGTGAT AAGCATTGCC CATTGCACAT 540

TGCCTTTATT GTGTAAGGAC ATGAAATTCC AGTTTTGCAT AGCTAGTGAT GAATACCTGA 600

AGGGAATTGC AGACATATTT TATTTTATTT TTAATTGACA GATGGAATTG TATATATTTA 660

TCATGTACAT AATCATGCTT TAAAATATGT ACATTATGGA ATGGCTAAAT CAAACTAACC 720

TAGGCATTAT CTCATATAAT TGTCATTTTT GTGGCGAGAA GACTAAAAAT CTACCCTTTC 780

AGCATTTTTA AAGAATACAA TGTGTTTTAT TAACAACAGT CACCATTTGG TACACTAGAT 840

CTCTTGAACT TCTTCCTCTT ATCTAACTGA GATCTTGTAA CCTTTGATAA CAGCTCCCAA 900

GCCCTTCCCC AACCACTGCT CCACCCGTGG TAACCACCAT TCTATTCTCA ACTTCCTGGT 960

AATCACCATT CTAGACACAG GGAAGACTCT CTACCCTCTG A 1001

(2) INFORMATION FOR SEQ ID NO: 196:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1443 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 196:

ATAAACTGAA ATAGGTCATG CAAATATAAA ATATTATTTT TAAATTATTT GTCATAAGAA 60 

ACGATGGTGG CCATATTTTG CTTTAATAAT GGAAAAAATG TGGTTAGCAT TCTKTGGAAG 120 

GTGGTCATCA GATAGTAGAC ATTTTCTAGG ATTTATTTCT ACCTGCATAT GTGGAAATGT 180 

GTACTACTTT AGATTTATWT AATGGCAGCT AACTCAGAGG CATCAAAATG TGCTAATGGT 240

GTAATATGGC CTTTCTCTTG CTGTYCTGTT TTGTARGCCT TCAATCAAGC ARGGGCAGGG 300

CCGTACAGTG AACTTGTCCT TTGSCAGACG CCAGCGTCTG CCCCTGACCC CGTCTCCACT 360

CTCTGTGTCC TGGAGGAGGA GCCCCTTGAT GCYTACCCTG ATTCACCTTC TGCGTGCCTT 420

GTACTGAACT GGGAAGAGCC GTGCAATAAC GGATCTGAAA TCCTTGCTTA CACCATTGAT 480

CTAGGAGACA CTAGCATTAC CGTGGGCAAC ACCACCATGC ATGTTATGAA AGATCTCCTT 540

CCAGAAACCA CCTACCGGTG AGTGCAAGGG AGTAGAAATC TGCATCAGCA CATCAGCACT 600

TGGGGATCTA AGTAAACCTC TCGGGGAAAA TGACCAAGTG GATGTCATCT CCCAGCTGTT 660






10 



20 



AAA 



30 



(2) INFORMATION FOR SEQ ID NO: 197:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1282 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 197:

GAAAAAAAAA AGTATGACCC AGTAGCTAGG C^CGGTGGC CCCGCCAAGT TGACA.CATAA 

AATTAACTGT CACAGTATCA TCTTAGAAGT GAAAG-_-_C-CC CCTTTATCCT C-CAGTGCCCC 



45 



TGCTATGGCT TGTATGTGTC CCCTCAAATT CAAGGGCGGC CAATGTGACA GCACCAAGAG 
50 GTGGGGTCTT TAAGAGATCA CTAGGCCATG AGGG-.TTCTC TTAGGACTGG GACGAAGGCC 
CATAATAAAA GAGGTTTCAG GGAGCATCCT GCTAGCTTGC CTTCTG7ATG CGAGAACACA 
GCAAGAAAGC CCTAGTCAAC AAGTGCCAGC TCCTTGAJTCT TAGACTTCCC AGCC7CCAGA 



55 



CAGCACAAAA TGAAGATACC ATACCTGAAC ACCTGAACAT 7CTTCACAAG GT AGT AAATG 
60 CACTGCTTTA TTCTGGTCTC AGTATTGTGT GCT7A-TAAG GAAATGAGAA AG-GGTGGATC 



TCTAAGAGCC CAGATGTCCA GAGTA7TGGC GCAGGTCGAT CCCTCAGG77 AGAACACCTG '_j 

TGAAAAAGCC ACACTGGTTC AGGGACTGA 0 GGGACGG7TT TGTGTCG-.7T Y7AAC.77GCA 730 

CCGTCTCTAC CCCAGAGTGG ACTCARATC7 CC-AAGTCA7C ■GTGTGA.-_C.--T GC7_^G7G-,GA 34 0 
AATT AT AAAA GGGCTTTGGC AATATGTTAG CGCA-.G.--ATT GGGCTCCTCC C^GAAAGTG7 

GCTGACNTTA AGAGTGGCTT AAATGATG7-T AAAA.C. V- - A AGATTTCTAA AAGGPG Z-GCA 95 3 

TTGGAGATAC GTTGACTTTT ATTAAACMAG CTA7AGT7GT TTAATGAYTT CT7A-AA_AAA7 102 0 

ATCTGGAGCT CAGGGGTTCA ACTGAGGGAA CACA7GG7GA GKATCATCGT GGAGT.-_-.TTA 108 C 

15 AATGCCAGGT AAC CCGTTGA AATTATCAAA AACAG CTC GC ACOT AG GAGA AAG-CAC GTC-. 11 4. 
GAGGATAGTT CTGTTATGGA GAAGATGAAA GGGTTGAGTA GTGTAGGAAC 7A-CC-AAAGG 
TGAGCTTAGA TTTGGATAGT AAAACCTCAA GACCGGA7GT AAAAAGTATT TGA7GAA.TGC 

AGCATAAATA ATTTAATTCA GTGTTAANAT GC GAA GC-CT A GTATATTGAG CGCAATGTGA 1320 

AAAGAAACTC ACATTGGGAG AATGCCACGT TTTCCTTATA AGATAGCTTT a-A.GA.TACC\ 1330 

25 TTTTAGACAG ATGGAAATTG AATAGCTTTA CLAAAA GGGAA ATGTTTGA7C TTGGGGAAAA 1440 

1443 



1200 
12-50 



60 
120 



TCTACCACCA CCTACTGAGA AAGAACATGG TGCGA7C7GG CATGGGAGAA A.GCTCAGTT 130 



240 
300 
360 
420 



ACTGTGAGAA ATACATTTCT G7TTCCTTACA AATTACCCAG GGTGGTGTAT .GGGTTATAG 480 



540 
600 






730 
340 



AGGGCATAGG ATGAACAAGT TACTGCTAGA CCTCTCACAA TC-CCA^.AAT - AAGA i . 

GTATTTTCAT CATTNCTTG7 CTCTTCGGAA GCTAACACCA TGCTA _ A.-. . A C-C- 
AGATGTCTAA AAACACCTTA AGTATTTGTC T AGAAA GG7G G7GCATTG7C I70AAAGAAC 
CAAAATTCMA A ATAATTTCA AAGGG-GCTAA AGCACT A>TT AA7CXAA.A7T 7-A77AG7T7T 

TAATGGTACT ACCACTCTGA AATTT AAAAT GTCATC77AC G7TCC7CTTC GTCGGATTGC- =>CC 

ATTTATTGCT AAAAC CTGGT AAACACTTTA ATCCYTTTCA A7TCCA7TA7 CAGGG-GTCTT 963 

GTCGAGAATT ACTCGCAGAC TAATAGTCAC C7GACT7TTC 7CCCTC-CA7C GGGA.7TTGC7 1020 

GTCTAATTGT GGTTACAAAT AAGTAACTGC CAAACTAA7C TTTCTAAAAA GGAAGACTGA. 10 3 0 

TCTCGTCACT CCTTTGCTCA ACAATGTAAA AC-GTCC CATT GTCTCCGAAA 7AAAACCAC-C 1140 

TTTCCACTGT GTATACAATA CATCCATGAT CTGTATCCA3 GATCATTTTTG 7A.7T! IGCTCA 1200 

CTTTATACAC CACCCCCCAT GCCACATCAA ATTAAA77AT CCTGA7AAA? C-C AAGT GCAA 1260 

AAAAAAAAAA AAAAAAACTC GA 1282 

(2) INFORMATION FOR SEQ ID NO: 198:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 951 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 198:

ATTTCGGAAC GAGGACTGAA GTGGGAGCGG CGGCAGGGTA GAAGA GAGA.-. GGGGGATCTA 60

TGTGGTAACT AAAGAATGTT TCTGTTTTGT TAATTA7TOT GTGTGTGTGG 7TTTATTGTT 120

TGCTTAAGAG AATCAAAAAC TGAAAAAAAT GAjGAATACAG GAAATC-GC7C 77G7TTATTT 180

TTTTGCTGTG TTTACAGCTT GTTAATGCTC TACTGTCTGT GTTTCAAGA.G AGA-GGTGTTC 240

ACTGCCCAGC TCGTTTTGTG TCCTGAGCCC TATGCCCAGC CCACCTTA7A AA7CATGCCT 300

GTTTAGATGT TTGATTTTGT TCTGTTTGCT ATTGTTATCT TAAAJGGTGTA 7AAC7CTGAC 360

ATGCCAGACA TCAAATTAAG CTCAAATTAA GCTCTCGTTT AAATGTTTAA ACACCTAATT 420

TATATTCTAA TTGATCGCAG CCACTGATGC ATGTAGTTTA GCTACT^GGG CTAAA7AAGC 480

ATATTAATTT TCCACATCAG GCCATCAGAT CTTGAGAACC AACAGT7A7C 7AGAATTCCG 540

TGTCTACTAA TGTTTCACGT GCATGCAGCC TTCA77AA7T 77GTAGCAAA A7A7AAAGTG 600

ATCATTATGT AGTTTCTGGA TTAAAAAAAT TTGTG7GTGA ACTTC-CTTTG 7AAAGTGCAT 660

GTGGAATTAA TGGGACAGTG TGCCCTTTGT C3TTASATGTT AGACCAAAAG AAAGGGCTTA 720

TAGTGTTAGT ATTGGAGCAC TTTGAAGATA GATATTTTCA GAAAAGATGT AGGATTTAAA 780

AGTTAAATTT TAAATTTTAG AAAAAGATAT GATGGCAATT GGAAATAGTC ACAATGAAGT 840

TCTTCATCCA GTAGGTGTTT AACAGTGTTA TTTTGCCACT GGTAATGTGT AAACTGTGAG 900

TGATTTACAA TAAATGATTA TGAATTCAAA AAAAAAAAAA AAAAAACTCG A 951

(2) INFORMATION FOR SEQ ID NO: 199:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1740 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 199:

TTATTATAAT AATGATGATG ATTCCAAGGA AAAAACCTAC AGCGAATGTT CCATTTCTAC 60

CCCGCACGCA GACACTCTCC CTAACACTGA TAACCTGAGC CCCCAGCACT GGACGGAAGA 120

ATGCTGGCGT CTCCGTGTGT ACTGGTTCAG C^TTCTGGCC CCAGCCTTGT CAGGACCCCC 180

TGGTGTCCAG AGCCCCCACC CCTCCCGCAA CAAGCAGCTG ATGCCCCAGT GATTCTCTAT 240

ACATTTTTCA CCTCGGCCAA TATGTCCAGG AAAACTGCTT ACTTCTCTTT TCTTGCCTGG 300

AGCCTTCATT GTTCACCCTT ACGTTGCAAT ATAGGAATTA ATGCTACAAA ATAAAAGTAA 360

AGCTTACCTG AAAAGTGCAT AGTTTGGGGC AATGGTATCT ACATCTCCCA CTGTGGGAAA 420

ACCAGCAAAG CATCAAAACT CTCAATTCTC CTGTTACCRA ATGCAGATCT GAATTATAAG 480

ATGTTTATGT TTGACCATTG TTTCAACAAT GGGATTTTGT TACGAATTAT CCCTTTAACT 540

GAAACCCTCA GTTTTACTGT TTACATTATT AGGAAAACAG GGATATCTTT TGAATCTAAA 600

AATTTGATGT ACAGCATGTG ATTTTTGAAG TTTACATGTA AAGTCACAGT ATAGGTGAAA 660

TAACGTTTGT CATATTTTGA GACGTATCCT GCAGCCATGT TTTTACGTGA GTGTTTTAGT 720

CAAAGTACAT GGTAGACAGT CTTTCACAAT AAAAGGAAAA GGATTTTTTT TCCTCCAAAT 780

GTACATTTAT CAACCTAATG ATTGATTTTT TTAAAAAGAG ATTTCGCCCC AGTCTGGTTT 840

ATGAAAGTTC ATTGCCCTAA ACTGTGCTGA TTGTTTTTAA TCAAGTTATA AATTTCCAAC 900

CTAGATCATO TATCTACCAA CTCTCCTGCA TTTTCCAAAA GGCATTGAGC TTAAATATTA 960

GTCTTGCTTA GAGTAGGTTA TCCACTTACA TGCTGCGCTA AAGCCATGCC TTTGAAACTC 1020

CTTGTTTAAA ACATGATATG ATCTTTGTGG GCAGTTTCAG AAAAGAAAAC AAACAAACAA 1080

AAATCGACCC TTTA^TTATT ACTTGCAACT CAACAGATCT CCCTGCCGTA CVGCC7Tr?C 1140



CAGGAACTTT ACTTCAGGGC TGTCCAGATT GCAGTTGTGC CCCGTGTATG TG3ATC7AGT 1200

TCACAGAGTC TTTGGAAGCC AGCAGTCGTG CGGTCGGTAT ACTGTCCACT GATTTTA7GT 1260

AGATTTGGTA TCCTCAGCAG CCAGTGTTAA CACCACTGTC ACGTAGTTAN CAGATTCATC 1320

TTTTATGTAT TTAAAGTAAT CCATACTATG ATTTGGTTTT TCCCTGCACC ATTAATTCTG 1380

GCATCAGATC AGTTTTTGTG TTGTGAAGTT CTACTGTGGT TTGACCCAAG ACCACAACCA 1440

TGAGACCCTG AAGTAAAGAT AAGGTACACA TACATTATTT GAGTAACTGT TTCCTTGGGG 1500

GCCAATCTGT GTATGCTTTT AGAAG7TTAC AGAATGCTTT TATTTTTGTC TATAACAAAC 1560

AGTCTGTCAT TTATTTCTGT TGATAAACCA TTTGGACAGA GTGAGGAGGT TTGCCCTGTT 1620

ATCTCCTAGT GCTAACAATA CACTCCAGTC ATGAGCCGGG CTTTAGAAAT AAAGCACTTT 1680

TGATGACTCA MAAAAAAAAA AAAAAAAAMG VCGGGGGGGG GCCGGTAACC CATTTNNCCC 1740

(2) INFORMATION FOR SEQ ID NO: 200:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1707 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 200:

CCTTATAGAA GGGAGAGGAG CGAACATGGC AGCGCGTTGG CGGTTTTGGT GTGTCTCTGT 60

GACCATGGTG GTGGCGCTGC TCATCGTTTG CGACGTTCCC TCAGCCTCTG CCCAAAGAAA 120

GAAGGAGATG GTGTTATCTG AAAAGGTTAG TCAGCTGATG GAATGGACTA ACAAAAGACC 180

TGTAATAAGA ATGAATGGAG ACAAGTTCCG TCGCCTTGTG AAAGCCCCAC CGAGAAATTA 240

CTCCGTTATC GTCATGTTCA CTGCTCTCCA ACTGCATAGA CAGTGTGTCG TTTGCAAGCA 300

AGCTGATGAA GAATTCCAGA TCCTGGCAAA CTCCTGGCGA TACTCCAGTG CATTCACCAA 360

CAGGATATTT TTTGCCATGG TGGATTTTGA TGAAGGCTCT GATGTATTTC AGATGCTAAA 420

CATGAATTCA GCTCCAACTT TCATCAACTT TCCTGCAAAA GGGAAACCCA AACGGGGTGA 480

TACATATGAG TTACAGGTGC GGGGTTTTTC AGCTGAGCAG ATTGCCCGGT GGATCGCCGA 540

CAGAACTGAT GTCAATATTA GAGTGATTAG ACCCCCAAAT TATGCTGGTC CCCTTATGTT 600

GGGATTGCTT TTGGCTGTTA TTGGTGGACT TGTGTATCTT CGAAGAGTAA TATGGAATTT 660

CTCTTTAATA AAACTGGATG GGCTTTTGCA GCTTTGTGTT 7TGTGCTTGC TATGACATCT 720

GGTCAAATGT GGAACCATAT AAGAGGACCA CCATATGCCC ATAAGAATCC CCACACGGGA 780

CATGTGAATT ATATCCATGG AAGCAGTCAA GCCCAGTTTG TAGCTGAAAC ACACATTGTT 840

CTTCTGTTTA ATGGTGGAGT TACCTTAGGA ATGGTGCITT TATGTGAAGC TGCTACCTCT 900

GACATGGATA TTGGAAAGCG AAAGATAATC TGTGTGGCTG GTATTGGACT TGTTGTATTA 960

TTCTTCAGTT GGATGCTCTC TATTTTTAGA TCTAAATATC ATGGCTACCC ATACAGCTTT 1020

CTGATGAGTT AAAAAGGTCC CAGAGATATA TAGACACTGG AGTACTGGAA ATTGAAAAAC 1080

GAAAATCGTG TGTGTTTGAA AAGAAGAATG CAACTTGTAT ATTTTGTATT ACCT^--^ 1140

TTCAAGTGAT TTAAATAGTT AATCATTTAA CCAAAGAAGA TGTGTAGTGC CTTAACAAGC 1200

AATCCTCTGT CAAAATCTGA GGTATTTGAA AATAATTATC CTCTTAACCT TCTCTTCCCA 1260

GTGAACTTTA TGGAACATTT AATTTAGTAC AATTAAGTAT ATTATAAAAA TTGTAAAACT 1320

ACTACTTTGT TTTAGTTAGA ACAAAGCTCA AAACTACTIT AGTTAACTTG GTCATCTGAT 1380

TTTATATTGC CTTATCCAAA GATGGGGAAA GTAAGTCCTG ACCAGGTGTT CCCACATATG 1440

CCTGTTACAG ATAACTACAT TAGGAATTCA TTCTTAGCTT CTTCATCTTT GTGTGGATGT 1500

GTATACTTTA CGCATCTTTC CTTTTGAGTA GAGAAATTAT GTGTGTCATG TGGTCTTCTG 1560

AAAATGGAAC ACCATTCTTC AGAGCACACG TCTAGCCCTC AGCAAGACAG TTGTTTCTCC 1620

TCCTCCTTGC ATATTTCCTA CTGAAATACA GTGCTGTCTA TGATTGTTTT TGTTTTGTTG 1680

TTTTTTYGAG ATCACGYTAC TGGGCTC 1707

(2) INFORMATION FOR SEQ ID NO: 201:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 779 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 201:

CTGTCCCCAG TGTTTCCAGG TAATGACTTG GCACTCCAGA GAAAGTTTCA TRCTGTTGCG 60

TGTGGTGGCT CC^AGCCAAG CACCTGGCAT GCAGGTCAGC CCTTCCCAGC GGGCGTGGCG 120

TCGTCCTCTT CAGAGATGCC ACGTTGCAGC CCCAAGGCCT CACCATTTTG CGTTTTTTAG 180

AAACCCATTT TCrTGGTCAT TTATAAAGCT GCTTTATAGA TATCTTTGAT CCTGGCATGC 240

CTTGGTTTCC TCTCCCTTCC CTCTTTCCAA TCCTGGTTTC CTAACCTCCT CTTGTAGTAA 300

TTCTCAACTC AACTCAAAGT CCCAAGAATT TGGAATGGTA GGATGCTGTG CGGGGAGCTC 360

GAGGCTGAGG CATAATCACT GCTTCGGTTC TGCTCATCAG GGGACACGCT CCCTTACTCA 420

TGGCAGCCAT GTTTGATTGT CACAGAGCCC CCCGAATACT CTGTCTATAG TGACACACTG 480

TAGGTGTCAT AAATTTTAAG AAACCTGCTT TTAAGTACTA TTTATAGGTT TTTCTGTTAT 540

ACTTGCAACC TAGTTTTAAA ATACATGAGG ATTTTATGAA AGCTTTATAC AGACATTTAT 600

AGGAAACTCA TTCTTTGATT TTAGGTSCCA TTTAAATTGA TAACACTTAC TTTATAAAAA 660

GATGCTTTTT GTCTGGATAG AGCCTTATAG TTTAAAATAT CTTCA7ATAT TGCCATTTGA 720

TCAAATAAAT TTCTTACTTA GAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAACTCGA 779



(2) INFORMATION FOR SEQ ID NO: 202:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1617 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 202:

GGCACAGCTT TCTGTCTCTT CCTCGCTCCC TCTCTTTCTC TCCTCCCTCT GCCTTCCCAG 60

TGCATAAAGT CTCTGTCGCT CCCGGAACTT GTTGGCAATG CCTATTTTTT GGCTTTCCCC 120

CGCGTTCTCT AAACTAACTA TTTAAAGGTC TGCGGTCGCA AATGGTTTGA CTAAACGTAG 180

GATGGGACTT AAGTTGAACG GCAGATATAT TTCACTGATC CTCGCGGTGC AAATAGCGTA 240

TCTGGTGCAG GCCGTGAGAG CAGCGGGCAA GTGCGATGCG GTCTTCAAGG GCTTTTCGGA 300

CTGTTTGCTC AAGCTGGGCG ACACATGGCC AACTACCCGC AGCCTGGGAC GACAAGACGA 360

ACATCAAGAC CGTGTGCACA TACTGGGAGG ATTTCCACAG CTGCACGGTC ACAGCCCTTA 420

CGGATTGCCA GGAAGGGGCG AAAGATATGT GGGATAAACT GAGAAAAGAA TCCAAAAACC 480

TCAACATCCA AGGCAGCTTA TTCGAACTCT GCGGCAGCGG CAACGGGGCG GCGGGGTCCC 540

TGCTCCCGGC GTTCCCGGTG CTCCTGGTGT CTCTCTCGGC AGCTTTAGCG ACCTGGCT1T 600

CCTTCTGAGC GTGGGGCCAG CTCCCCCCGC GCGCCCACCC ACACTCACTC CATGCTCCCG 660

GAAATCGAGA GGAAGATCCA TTACTTCTTT GGGGACGTTG TGATTCTCTG TGATGCTGAA 720

AACACTCATA TAGGATTGTG GGAAATCCTG ATTCTCTTTT TTATTTCGTT TGATTTCTTG 780

TGTTTTATTT GCCAAATGTT ACCAATCAGT GAGCAAGCAA GCACAGCCAA AATCGGACCT 840

CAGCTTTAGT CCC^CTTCAC ACACAAATAA GAAAACGGCA AACCCACCCC ATTTTTTAAT 900

TTTATTATTA TTAATTTTTT TTGTTGGCAA AAGAATCTCA GGAACGGCCC TGGGCACCTA 960

CTATATTAAT CATGCTAGTA ACATGAAAAA TGATGGGCTC CTCCTAATAG GAAGGCGAGG 1020

AGAGGAGAAG GCCAGGGGAA TGAATTCAAG AGAGATGTCC ACGGACGAAA CATACGGTGA 1080

A7AATTCAC3 CTCACGTCGT TCTTCCACA3 TATCTTGTTT TGATCATTTC CACTGCACAT 1140

TTCTCCTCAA GAAAAGCGAA AGGACAGACT GTTGGCTTTG TCCTTGGAGG ATAGGAGGGA 1200

GAGAGGGAAG GGGCTGAGGA AATCTCTGGG GTAAGAGTAA AGGCTTCCAG AAGACATGCT 1260

GCTATGGTCA CTGAGGGGTT AGCTTTATCT GCTGTTGTTG ATGCATCCGT CCAAGTTCAC 1320

TGCCTTTATT TTCCCTCCTC CCTCTTGTTT TAGCTGTTAC ACACACAGTA ATACCTGAAT 1380

ATCCAACGGT ATAGATCACA AGGGGGGGAT GTTAAATGTT AATCTAAAAT ATAGCTAAAA 1440

AAAGATTTTG ACATAAAAGA GCCTTGATTT TAAAAAAAAA AGAGAGAGAG ATGTAATTTA 1500

AAAAGTTTAT TATAAATTAA ATTCAGCAAA AAAAGATTTG CTACAAAGTA TAGAGAAGTA 1560

TAAAATAAAA GTTATTGTTT GAAAAAAAAA AAAAAAAAAW CTCGACCGCA AGGGAAT 1617



(2) INFORMATION FOR SEQ ID NO: 203: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1974 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 203:

GAATTCGGCA CGAGGCTGAG GGAGCTGCAG CGCAGCAGAG TATCTGACGG CGCCAGGTTG 60

CGTAGGTGCG GCACGAGGAG TTTTCCCGGC AGCGAGGAGG TCCTGAGCAG CATGGCCCGG 120

AGGAGCGCCT TCCCTGCCGC CGCGCTCTGG CTCTGGAGCA TCCTCCTGTG CCTGCTGGCA 180

CTGCGGGCGG AGGCCGGGCC GCCGCAGGAG GAGAGCCTGT ACCTATGGAT CGATGCTCAC 240

CAGGCAAGAG TACTCATAGG ATTTGAAGAA GATATCCTGA TTGTTTCAGA GGGGAAAATG 300

GCACCTTTTA CACATGATTT CAGAAAAGCG CAACAGAGAA TGCCAGCTAT TCCTGTCAAT 360

ATCCATTCCA TGAATTTTAC CTGGCAAGCT GCAGGGCAGG CAGAATACTT CTATGAATTC 420

CTGTCCTTGC GCTCCCTGGA TAAAGGCATC ATGGCAGATC CAACCGTCAA TGTCCCTCTG 480

CTGGGAACAG TTCCTCACAA GGCATCAGTT GTTCAAGTTG GTTTCCCATG TCTTGGAAAA 540

GAGGATGGGG TGGCAGCATT TGAAGTGGAT GTGATTGTTA TGAATTCTGA AGGCAACACC 600

ATTCTCCAAA CACCTCAAAA TGCTATCTTC TTTAAAACAT GTCAACAAGC TGAGTGCCCA 660

GGCGGGTGCC GAAATGGAGG CTTTTGTAAT GAAAGACGCA TCTGCGAGTG TCCTGATGGG 720

TTCCACGGAC CTCACTGTGA GAAAGCCCTT TGTACCCCAC GATGTATGAA TGGTGGACTT 780

TGTGTGACTC CTGGTTTCTG CATCTGCCCA CCTGGATTCT ATGGAGTGAA CTGTGACAAA 840

GCAAACTGCT CAACCACCTG CTTTAATGGA GGGACCTGTT 7CTACCCTGG AAAATGTATT 900






TSCCCTCCAG GACTAGAGGG AGAGCAGTGT GAAATCAGCA AATGCCCACA ACCCTGTCGA 960

AATGGAGGTA AATGCATTGG TAAAAGCAAA TGTAAGTKTT CCAAAGGTTA CCAGGGAGAC 1020

CTCTGTTCAA AGCCTGTCTG CGAGCCTGGC TOTGGTGCAC ATGGAACCTG CCATGAACCC 1080

AACAAATGCC AATGTCAAGA AGGTTGGCAT GGAAGACACT GCAATAAAAG GTACGAAGCC 1140

AGCCTCATAC ATGCCCTGAG GCCAGCAGGC GCGCAGCTCA GGCAGCACAC GCCTTCACTT 1200

AAAAAGGCCG AGGAGCGGCG GGATCCACCT GAATCCAATT ACATCTGGTG AACTCCGACA 1260

TCTGAAACGT TTTAAGTTAC ACCAAGTTCA TAGCCTTTGT TAACCTTTCA TGTGTTGAAT 1320

GTTCAAATAA TGTTCATTAC ACTTAAGAAT ACTGGCCTGA ATTTTATTAG CTTCATTATA 1380

AATCACTGAG CTGATATTTA CTCTTCCTTT TAAGTTTTCT AAGTACGTCT GTAGCATGAT 1440

GGTATAGA'IT TrCTTGTTTC AGTGCTTTGG GACAGATTTT ATATTATGTC AATTGATCAG 1500

GTTAAAATTT TCAGTGTGTA GTTGGCAGAT ATTTTCAAAA TTACAATGCA TTTATGGTGT 1560 

CTGGGGGCAG GGGAACATCA GAAAGGTTAA ATTGGGCAAA AATGCGTAAG TCACAAGAAT 1620 

TTGGATGGTG CAGTTAATGT TGAAGTTACA GCATTTCAGA TTTTATTGTC AGATATTTAG 1680 

ATGTTTGTTA CATTTTTAAA AATTGCTCTT AATTTTTAAA CTCTCAATAC AATATATTTT 1740 

GACCTTACCA TTATTCCAGA GATTCAGTAT TAAAAAAAAA AAAATTACAC TGTGGTAGTG 1800

GCATTTAAAC AATATAATAT ATTCTAAACA CAATGAAATA GGGAATATAA TGTATGAACT 1860 

TTTTGCATTG GCTTGAAGCA ATATAATATA TTGTAAACAA AACACAGCTC TTACCTAATA 1920 

AACATTTTAT ACTGTTTGTA TGTATAAAAT AAAGGTGCTG CTTTAGTTTT CTGA 1974 






(2) INFORMATION FOR SEQ ID NO: 204: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1057 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 204:

CGGCCTTCCG GGGCAACCGT TCGTCCCAAC NCGGGAAAGG GTCCTGGAGN CGGGAACTAG 60
GAGCCTCGGA AGTCCAAGGG CGGAGCGCCC TTTGCTAATA AGCCAATCAG AACGTGAGAC 120
GCTCCGGTGG GNCGGTGCCG TCGAGCGCGG GGTGGAGTCT GGGTGACTTG GCTGGCGGGA 180
TCAAGTGCAG CTGCTTCAGG CTGAGGTGGC AGATAGTGAG CGCTGGTGGC GGAGTTAAAG 240
rfAAAGCAGG AGAGTAATWA TGAATAGCGC AGCGGGATTC TCACACCTAG ACCGTCGCGA 300
GCGGGTTCTC AAGTTAGGGG AGAGTTTCGA GAAGCAGCC3 CGCTGCGCTT C'JACACTGTG 360
CGCTATGACT TCAAACCTGC TTCTATTGAC ACTTCTTCTG AAGGATACCT TGAGKTTGGC 420
GAAGKTGAAC AGKTGACCAT WACTCTGCCM AATATAGAAA GTTGAAGGAA GCAoTAAAAT 480
TCAGTATCGT AAAGAACAAC AGCAACAACA ATGTGGAATT CASCCAGGAC TCCCAATCTT 540
GTAAAACATT CTCCATCTGA AGATAAGATG TCCCCAGCAT CTCCAATAGA TGATATCGAA 600
AGAGAACTGA AGGCAGAAGC TAGTCTAATG GACCAGATGA GTAGTTGTGA TAGTTCATCA 660
GATTCCAAAA GTTCATCATC TTCAAGTAGT GAGGATAGTT CTAGTGACTC AGAAGATGAA 720
GATTGCAAAT CCTCTACTTC TGATACAGGG NAATTGTGTC TCAGGACATC CTAC " ATGAC 780
ACAGTACAGG ATTCCTGATA TAGATGCCAG TCATAATAGA TTTGGAGACA ACAGTGGCCT 840
TCTGATGAAT ACTTTAAGAA ATGATTTGCA GCTGAGTGAA TCAGGAAGTG ACAGTGATGA 900
CTGAAGAAAT ATTTAGCTAT AAATAAAAAT TTATACAGCA TGTATAATTT ATTTTGTATT 960
AACAATAAAA ATTCCTAAGA CTGAGGGAAA TATGTCTTAA CTTTTGATGA TAAAAGAAAT 1020
TAAATTTGAT TCAGAAAAAA AAAAAAAAAA AACTCGA 1057



(2) INFORMATION FOR SEQ ID NO: 205:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 721 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 205:

GAATTCGGCA CGAGTCATCC CTCTCCCTCT TTCACTCCCT TACTCTTACT CTGTTTTTTG 60
TGCTCCAGAC AGACAGACCC TACCTCTTTT GCTTCTTTTT TGTTTGTTTG TTTTGAGATG 120
GAGTGTCGCT CTTGTTGCCC AGGCTGGAGT GCAGTGGCGC AATCTCGGCT CACCACAACC 180
TCTGCCTCCC GGGTTCAAGC AATTCTCCTG CCTCAGCCTC CCGAGAAGCT GGGGATTACA 240
GGCATGCGCC ACCACACCCA GCTNAATTTT ATATTTTTAG TAGAGATGGT GTTTCTCCAT 300
GTTGGTCAGG CTGGCCTCAA ACTCCCAACC TCAGGTGATN CCGCCTGCTT TGGCCTCCCC 360
AAAGTGCTGG GATTACAGGC GTGAGCCACT GCGCCCAGCC TCTTTTGCTC CTTTATACTC 420
ATTAACTCAC GCCTGTAATC CCTGTTTTGG GAGGCCAAAG TGAGAAGGTT GCTTGAGGCC 480
AAGAGTTTGA GACTAGCCTG GGCAACACAG CAAGATGCCA TCTTTATAAT AAAAATAAAA 540
ATAAAAATCA ATTAGCTGGG CATGGTGGAA CGCACCTGTA GTCCCAGCCA ATTGAGACrGC 600
TGAAGTGGGA GGATCATTGA GCCCAGGAGT TGAGGTTGCA GTGAGCCATG ATCATGTCAC 660
TACACTCAGC CTGGGCAATA GAGGGACATG TTGTCTCTAA AAAAAAAAAA AAAAAACTCG 720
A 721

(2) INFORMATION FOR SEQ ID NO: 206:



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2465 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 206:

CCACCATTTA TCCAACTGAA GAGGAGTTAC AGGCAGTTCA GAAAATTGTT TCTATTACTG 60
AACGTGCTTT AAAACTCGTT TCAGACAGTT TGTCTGAACA TGAGAAGAAC AAGAACAAAG 120
AGGGAGATGA TAAGAAAGAG GGAGGTAAAG ACAGAGCTTT GAAAGGAGTT TTGCGAGTGG 180
GAGTATTGGC AAAAGGATTA CTTCTCCGAG GAGATAGAAA TGTCAACCTT GTTTTGCTGT 240
GCTCAGAGAA ACCTTCAAAG ACATTATTAA GCCGTATTGC AGAAAACCTA CCCAAACAGC 300
TTGCTGTTAT AAGCCCTGAG AAGTATGACA TAAAATGTGC TGTATCTGAA GCGGCAATAA 360
TTTTGAATTC ATGTGTGGAA CCCAAAATGC AAGTCACTAT CACACTGACA TCTCCAATTA 420
TTCGAGAAGA GAACATGAGG GAAGGAGATG TAACCTCGGG TATGGTGAAA GACCCACCGG 480
ACGTCTTGGA CAGGCAAAAA TGCCTTGACG CTCTGGCTGC TCTACGCCAC GCTAAGTGGT 540
TCCAGGCTAG AGCTAATGGT CTGCAGTCCT GTGTCATTAT CATACGCATT CTTCGAGACC 600
TCTGTCAGCG AGTTCCAACT TGGTCTGATT TTCCAAGCTG GGCTATGGAG TTACTAGTAG 660
AGAAAGCAAT CAGCAGTGCT TCTAGCCCTC AGAGCCCTGG GGATGCACTG AGAAGAGTTT 720
TTGAATGCAT TTCTTCAGGG ATTATTCTTA AAGGTAGTCC TGGACTTCTG GATCCTTGTG 780
AAAAGGATCC CTTTGATACC TTGGCAACAA TSACTGACCA GCAGCGTGAA GACATCACAT 840
CCAGTGCACA GTTTGCATTG AGACTCCTTG CATTCCGCCA GATACACAAA GTTCTAGGCA 900
TGGATCCATT ACCGCAAATG AGCCAACGTT TTAACATCCA CAACAACAGG AAACGAAGAA 960
GAGATAGTGA TGGAGTTGAT GGATTTGAAG CTGAGGGGAA AAAAGACAAA AAAGATTATG 1020
ATAACTTTTA AAAAGTGTCT GTAAATCTTC AGTGTTAAAA AAACAGATGC CCATTTGTTG 1080
GCTGTTTTTC ATTCATAATA ATGTCTACAT TGAAAAATTT ATCAAGAATT TAAAGGATTT 1140
CATGGAAGAA CCAAGTTTTT CTATGATATT AAAAAATGTA CAGTGTTAGG TATTATTTGA 1200
ATGGAAAGAC ACCCAAAAAA AAAAATGTGC TCCGACTAGG GGGAAAACAG TAGTTCCGAT 1260



TTTTTCCCAT TATTTTTATT TTATTTTCTG GTTGCCCTAG CT7CCCCCC:: TATTTTTGTG 1320
TCTTTTATTA ACTAGTGCAT TGTCTTATTA AATCTTCACT GTATTTAATG CAGGATGTCT 1380
GCTTCAGTTG CTCTGTGTAT TTTGATATTT TAATTTAGAG GTTTTGTTTG CTTTTTGACA 1440
CTAGTTGTAA GTTACTTTGT TATAGATGGT ATCCTTTACC CCTTCTTAAT ATTTTACAGC 1500
AGTACGTTTT TTTGTAACGT GAGACTGCAG AGTTTGTTTT TCTATATGTG AAGGATTACA 1560
ACACAAAAAG TTATCCTGCC ATTCGAGTGC TCAGAACTGA ATGTTTCTGC AGATCTTGTG 1620
GCATTTGTCT CTAGTGTGAT ATATAAAGGT GTAATTAAGA CAGAGTTCTG TTAATCTAAT 1680
CAAGTTTGCT GTTAGTTGTG CATTAGCAGT ATAAAAGCTA ATATATACTA TATGGTCTTG 1740
CAACAGTTTT AAAGCCTCTG CATAATTGAT AATAAAAATG CATGACATTC TTGTTTTTAA 1800
TAGACTTTTA AAATCATAAT TTTAGGTTTA ACACGTAGAT CTTTGTACAG TTGACTTTTT 1860
GACATAGCAA GGCCAAAAAT AACTTTCTGA ATATTTTTTT CTTGTGTATA AGTGGAAAGG 1920
GCATTTTTCA CATATAAGTG GGCTAACCAA TATTTTCAAA AGAACTTCAT CATTGTACAA 1980
CTAACAACAG TAACTAGCCC TTAATTATGG TGACAGTTCC TTATTGGTGT GTGTGAGATT 2040
ACTCTAGCAA CTATTACAGT ATAACACAGA TGATCTTCTC CACACACCCC ATCACCCAGA 2100
TAATTTACAG TTCTGTTAAG AGTGAGGTTG ATAAAGTATT ACTGATAAAA AATTATCTAA 2160
GGAAAAAAAC AGAAAATTAT TTGGTGTGGC CATCTTACCT GCTTATGTCT CCTACACAAA 2220
GCTAAATATT CTAGCAGTGA TGTAATGAAA AATTACATCT TACTGTTGAT ATATGTATGC 2280
TCTGGTACAC AGATGTCATT TTGTTGTCAC AGCACTACAG TGAAATACAC AAAAAATGAA 2340
ATTCATATAA TGACTTAAAT GTATTATATG TTAGAATTGA CAACATAAAC TACTTTTGCT 2400
TTGAAATGAT GTATGCTTCA GTAAAATCAT ATTCAAATTT AAAAAAAAAA AAAAAAAAAA 2460
CTCGA 2465

(2) INFORMATION FOR SEQ ID NO: 207:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1480 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 207:

GAATTCGGCA CGA3CTCAAG CTGGCAGGTG GTCGGGGGAG CGGCCGGAGA GGAGCTGCCG 60
GGAGTTCGTG CCCTGCAGGA CATGACACCA GTGGCATATC ACGGCCATGG GGTCTCAGCA 120
TTCCGCTGCT GCTCGCCCCT CCTCCTGCAG GCGAAAGCAA GAAGATGACA GGGACGGTTT 180



GCTGGCTGAA CGAGAGCAGG AAGAAGCCAT TGCTCAG'ITC CCATATGTGG AATTCACCGG 240
GAGAGATAGC ATCACCTGTC TCACGTGCCA GGGGACAGGC TACATTCCAA CAGAGCAAGT 300
AAATGAGTTG GTGGCTTTGA TCCCACACAG TGATCAGAGA TTGCGCCCTC AGCGAACTAA 360
GCAATATGTC CTCCTGTCCA TCCTGCTTTG TCTCCTGGCA TCTGGTTTGG TGGTTTTCTT 420
CCTGTTTCCG CATTCAGTCC TTGTGGATGA TGACGGCATC AAAGTGGTGA AAGTCACATT 480
TAATAAGCAA GACTCCCTTG TAATTCTCAC CATCATGGCC ACCCTGAAAA TCAGGAACTC 540
CAACTTCTAC ACGGTGGCAG TGACCAGCCT GTCCAGCCAG ATTCAGTACA TGAACACAGT 600
GGTGAATTTT ACCGGGAAGG CCGAGATGGG AGGACCGTTT TCCTATGTGT ACTTCTTCTG 660
CACGGTACCT GAGATCCTGG TGCACAACAT AGTGATCTTC ATGCGAACTT CAGTGAAGAT 720
TTCATACATT GGCCTCATGA CCCAGAGCTC CTTGGAGACA CATCACTATG TGGATTGTGG 780
AGGAAATTCC ACAGCTATTT AACAACTGCT ATTGGTTCTT CCACACAGCG CCTGTAGAAG 840
AGAGCACAGC ATATGTTCCC AAGGCCTGAG TTCTGGACCT ACCCCCACGT GGTGTAAGCA 900
GAGGAGGAAT TGGTTCACTT AACTCCCAGC AAACATCCTC CTGCCACTTA GGAGGAAACA 960
CCTCCCTATG GTACCATTTA TGTTTCTCAG AACCAGCAGA ATCAGTGCCT AGCCTGTGCC 1020
CAGCAAATAG TTGGCACTCA ATAAAGATTT GCAGAATTTA ATACAGATCT TTTCAGCTGT 1080
TCTTAGGGCA TTATAAATGG AAATCATAAC GTGGTTCTAG GTTATCAAAC CATGGAGTGA 1140
TGTGGAGCTA GGATTGTGAG TGACCTGCAG GCCATTATCA GTGCCTCATC TGTGCAGAAG 1200
TCGCAGCAGA GAGGGACCAT CCAAATACCT AAGAGAAAAC AGACCTAGTC AGGATATGAA 1260
TTTGTTTCAG CTGTTCCCAA AGGCCTGGGA GCTTTTTGAA AAGAAAGAAA AAAGTGTGTT 1320
GGCTTTTTTT TTTTTTAGAA AGTTAGAATT GTTTTTACCA AGAGTCTATG TGGGGCTTGA 1380
TTCACCCTTC ATCCATTGGC TGGAACATGG ATTGGGGATT TGATAGAAAA ATAAACCCTG 1440
CTTTTGATTC AAAAAAAAAA AAAAAAWAAA AAAAACTCGA 1480

(2) INFORMATION FOR SEQ ID NO: 208:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 872 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 208:

CAG-ATTTCC CTCAGTACTG TAAGCAAAAG TGGTATGTTT TTCTTTCTTT ATGTCTACTC 60



TGTCCTCTGT GGCCT7CTGG 7GTACCCCTC TCTTCCTAGC CATTCAG7CT CTCTAGTCAC 120
CTCCCTAGTA GCTAGTGCTC TCTAAGTTTT TATTTAATTA GAA'CAACTCC ATTTCCATTT 180
CAAGG7AGGT CAATGGGGGG AAAAGCCTCA TGATTTAAAC TGAAGTTAAC AACACAGCTT 240
TTAAAATGAA AACTCATACT CCAACTTCTA AAGTATATTT GAGCTGATTT GTTTCCAAAA 300
CAAAGATATG CTGTACCTAA AACTGCTAAA ACAAAAATAT AAAGACAAGG ACTAGGTGAT 360
TAAGGGGAGA GAAAAATCAT YTCTTTTCCA GGAAACCTTT GCTAAAATAA GCAAAACTTG 420
ANTCTATGCT TCATGGAAAC TGACACAAAG AAAAGAAACT GATGGATTGC ACAGGCCTTG 480
TTATAGAAAT AGATCTATAA AAAGATCTGT CCACAGGAAA TATACACCTT CTCCTGGTTC 540
TGAACTTCAA TGGGGATTTG TCACCTAGGT CTCCATCTAT AGGAATACCT TCACATACCT 600
ATCTATTCAT GCACATATTC TGAAAACAGG TACATACAAA ATTACAACAA AGGAAAAAAA 660
TTCTATTGAA CACTTAAAAA TAGAAACAGG CCAGGCACGG TGGCTCATGC TGTAATCCCA 720
ACAATTTGGG AGGCTGAGGC TGGTGGATCA CCTGAGGTCA GGAGTGTGAG ACCAGCTTGG 780
CCAACATGGT GAAACCCCGT CACTACTAAA AATACAAAAA AAATTAGCCT GTGTGGTGGC 840
ACACTCNTAC AATCCNGGCT GACTCGGGAA AN 872

(2) INFORMATION FOR SEQ ID NO: 209:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1779 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 209:

AATTGCCAAG ACTGCACAAA ATTACAGTGC TAATGTATAT GGTTGCAGTT CACATAAAGA 60
CAAAAGCATC TGTTATGAAA TGAGTAGTAA TATTGGGTGG TTGATTTGTT CTTAGCAGAC 120
TTGGCTTCAT WTTGGTCTTG AGATAAAATG GCCAGCATAA ATGCTGTTTA TATTCACGTT 180
TTCCTAGGTG TGTGTGTGCA GGCCACAGCA GCATGCCCTT GGTGTAGTCA GTGCCGAAAS 240
GGGTCTGTTC CTTCTTGAGC CTGCCTGCAG GGATGGTCTC CTTTTAAAGC AGGTTGTGTG 300
CAGCATTCAG TACACTGAAG GTAAGCTAAA CCATCAACAT CTCTGGTGTT TTAAGATGTT 360
ATTTTATTGG AACAACTGAC AAATGAGGGA TGTTAGCTTT GTGGCAGAAT TCCCTGCATG 420
TGTGATAACT GATCTTGTTT TATTTTTTGG CATTGCAACT GTGGCATAGT TACAATTTCT 480
GTTTGKTCAT CACATTTAAA ATTGGRAGAG AACGCGCTTG AKGGATAGAG CCCCTTCAGK 540
GTACTGTTTC TTATTAACTT TACTTTTTTT AAATCAACTT GCTATAGACT TTATATACAT 600



TTTGTTAAAT ATAGTTCCTA GTGACATAGA AACGATGCGT AGTTTTCATT TACTAATTAC 660
AAATGTTGAG GCCTAATTCT GAAAGTCCTC ATATTTAAAG GCTAGACAAC GTAATGAAAT 720
TTTTAACTAT TTGTATGTCA TTTTGAAAGT GTACTGCTTT ATGGTAAAAG TGTTTTTCAT 780
TTGTTCATTG TTTTCATTAT TTGTGATCAT GTTGTCTTTC AATACAGGCA TAAACCTTCC 840
ACTCTTGAAC AAAGCAGCTG CTTTTTAAAA GCGGTAATTG CTTCTTTACC TTTTATTTCT 900
TTTGTAAATG AAGCTTTTCT TTAAGAATGT GACTTTAAAG TGTTGTCTAT TGCATAAAAC 960
AGTTGACACT CACTTATTGT AAAGTGAAGA TTGTTCTACT GCATGTGAAG TGGACCATGC 1020
AGATTTCTGT ATGTTCTCAG TATGCATCAC TAGATAATAA AGTCTTTTGT GAACAAGGCA 1080
TTTGTAGCCA TTTTTAAAAG TTTTTGTCTT CAGTGCTGGT AAGTCAGGTA AACCATAAAT 1140
AGTTAAAAGC AACCTTTTGT TTTTTTCCTG AAAGTTTTTA ATTGAAAGTA TTATTAGTTA 1200
AAGATGTAAA CCTAGCCAAA ATTACCAGTT TATTAATAAT TAGGATCCTA ATTATTTCAA 1260
AAAATCCTAC AAATATTGTC AGCTTTCAGT GTAGTGAGAT TATTCCTGTA GGTTATGGGG 1320
TATAATTCAG GATTTAACTA ATGTTTCTGC TATTTTCTCA CTTTTCCTTT TGATGGTGCG 1380
GAAAGAGAAA AAGGAAAACG GGGCACAGGC CATTCGACGC CTTCTCCAAG GGGTCTGATT 1440
TGCTGAGACA CCAGCTTCAC CTTCTTAACA AGGCACCTAA TTACAACAAG CATGCACATT 1500
TTGGTGCATT CAAGAATGGA AAATCAGAAT AGCAGCATTG ATTCTTCTGG TGCAGCTCAG 1560
TGGAAGATGA TGACAACCAG AAGACATGAG CTAAGGGTAA GGGACTGTTC TGAAGAACCT 1620
TTCCATTTAG TGATCAAGAT ATGGAAGCTG ATTTCTGAAA ATGCTCAGTG TGTACTCTAA 1680
TTATTTATGG TACCATTTGA ATTCTAACTT GCATTTTAGC AGTGCATGTT TCTAATTGAC 1740
TTACTGGGAA ACTGAATAAA ATATGCCTCT TATTATCAA 1779



(2) INFORMATION FOR SEQ ID NO: 210:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2110 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 210:

GCGGCCGCTG CAGCCCGGAG CTGAGCTAGC CGTCCGAGCC GAGCCGTCCG AGCCGGGGAA 60
GCCGGCGCGT GCTGCCGCTC GTGGCGGCCA GAGGAGAGGA GAGGCAGCAG CATGGCGAGT 120
GTCCTGTCCC GACGCCTTGG AAAGCGGTCC CTCCTGGGAG CCCGGGTGTT GGGACCCAGT 180
3CCTCGGAGG GGCCTCGGCT GCCCCACCCT CGGAGCCACT GCTAGAA3GG GCCGCTCCCZ 240
AGCCTTTCAC CACCTCTGAT GACACCCCCT GCCAGGAGCA GCCCAAGGAA GTCCTTAAGC 300
CTCCCAGCAC CTCGGGCCTT CAGCAGGTGG CCTTTMAGCC TGGGCAGAAG GTTTATGTGT 360
GGTACGGGGG TCAAGAGTGC ACAGGACTGG TGGWGCAGCA CAGCTGGATG GAGGGTCAGG 420
TGACCGTCTG GCTGCTGGAG CAGAAGCTGC AGGTCTGCTG CAGGGTGGAG GAGGTGTGGC 480
TGGCAGAGCT GCAGGGCCCC TGTCCCCAGG CACCACCCCT GGAGCCCGGA GCCCAGGCCC 540
TGGCCTACAG GCCCGTCTCC AGGAACATCG ATGTCCCAAA GAGGAAGTCG GACGCATGGA 600
AATGGATGAG ATGATGGCGG CCATGGTGCT GACGTCCCTG TCCTGCAGCC CTGTTGTACA 660
GAGTCCTCCC GGGACCGAGG CCAACTTCTC TGCTTCCCGT GCGGCCTGCG ACCCATGGAA 720
GGAGAGTGGT GACATCTCGG ACAGCGGCAN CAGCACTACC AGCGGTCACT GGAGTGGGAG 780
CAGTGGTGTC TCCACCCCCT CGCCCCCCCA CCCCCAGGCC AGCCCCAAGT ATTTGGGGGA 840
TGCTTTTGGT TCTCCCCAAA CTGATCATGG CTTTGAGACC GATCCTGACC CTTTCCTGCT 900
GGACGAACCA GCTCCACGAA AAAGAAAGAA CTCTGTGAAG GTGATGTACA AGTGCCTGTG 960
GCCAAACTGT GGCAAAGTTC TGCGCTCCAT TGTGGGCATC AAACGACACG TCAAAGCCCT 1020
CCATCTGGGG GACACAGTGG ACTCTGATCA GTTCAAGCGG GAGGAGGATT TCTACTACAC 1080
AGAGGTGCAG CTGAAGGAGG AATCTGCTGC TGCTGCTGCT GCTGCTGCCG CAGACCCCCA 1140
GTCCCTGGGA CTCCCACCTC CGAGCCAGCT CCCACCCCCA GCATGACTGG CCTGCCTCTG 1200
TCTGCTCTTC CACCACCTCT GCACAAAGCC CAGTCCTCCG GCCCAGAACA TCCTGGCCCG 1260
GAGTCCTCCC TGCCCTCAGG GGCTCTCAGC AAGTCAGCTC CTGGGTCCTT CTGGCACATT 1320
CAGGCAGATC ATGCATACCA GGCTCTGCCA TCCTTCCAGA TCCCAGTCTC ACCACACATC 1380
TACACCAGTG TCAGCTGGGC TGCTGCCCCC TCCGCCGCCT GCTCTCTMTC TCCGGTCCGG 1440
AGCCGGTCGC TAAGCTTCAG CGAAGCCCCA GCAGCCAGCA CCTGCGATGA AATCTCATCT 1500
GATCGTCACT TCTCCACCCC GGGCCCAGAG TGGTGCCAGG AAAGCCCGAG GGGAGGCTAA 1560
GAAGTGCCGC AAGTGTATGG CATCGAGCAC CGGGACCAGT GGTGCACGGC CTGCCGGTGG 1620
AAGAAGGCCT GCCAGCGCTT TCTGGACTGA GCTGTGCTGC AGGTTCTACT CTGTTCCTGG 1680
CCCTGCCGGC AGCCACTGAC AAGAGGCCAG TGTGTCACCA GCCCTCAGCA GAAACCGAAA 1740
GAGAAAGAAC GGAAACACGG AGTTTGGGCT CTGTTGGCTA AGGTGTAACA CTTAAAGCAA 1800
TTTTCTCCCA TTGTGCGAAC ATTTTATTTT TTAAAAAAAA GAAACAAAAA TATTTTTCCC 1860
CCTAAAATAG GAGAGAGCCA AAACTGACCA AGGCTATTCA GCAGTGAACC AGTGACCAAA 1920
GAATTAATTA CCCTCCGTTT CCCACATCCC CACTCTCTAG GGGATTAGCT TGTGCGTGTC 1980
AAAAGAAGGA ACAGCTCGTT CTGCTTCCTG CTGAGTCSGT GAATTCTTTG CTT7CTAAAC 2040
TCTTCCAGAA AGGACTGTGA GCAAGATGAA TTTACTTTTC TTAAAAAAAA AAAAAAAAAA 2100
AAAAACTCGA 2110



(2) INFORMATION FOR SEQ ID NO: 211:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 938 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 211:

GGCACAGGAA AAAAAAGAAA AAAGAAAAAA GAAAAAAGTT TTTGTACCCA CAGATTAGCA 60
TTTTCTTGAT GTTTGAAAAA AGTTTAAGCT ATGTCCTAAT TTAAAAATGA GCACAAACTA 120
CTTAACAGAT GTCTGTTCCC TCTTCTCTTA CTTAAATTAT CTTTATTTTC ACCATCACCT 180
CCCAGTGCCG AACACCTGAN CTCTGTGTTT TGTGGTTGGA TCCTGGGTTG CCAAGTTCCT 240
ATTTGGTCAG TCCCTGGCCT GTGGGGCGGT CTCAGGAAGT GGCATGCTCT TCAMGRAGGA 300
TCGTTCATYT CCAGTATAAC CAWTTTGTTA ATAATAGTTG ATAATTCCCA GCTTTTACCA 360
GATGARTTTT GACTTATTTT TCCTCCTTTG ACCTGTTCAA AGCTAACATA TCTCGGTCAG 420
TTCGGAGAGG GTGGGGGATT TGAGAATGTG AGGAGGAGTG GGGTTAGAAT GGGTTTGCCT 480
ATCTGGGCAA GGAAAGAGTT CCTAGTCGAT TGGGCACAAT GACAAAATGA TTCCATGGAT 540
AGAATCGTCC CATGTTGCTG GAACACCTCA CGTGTTGTGA ACGCCTTAAA TTCCTGCCAT 600
CCCTTCTCTG ATTCCCCACC TCCCTGTAGT TTCCACAGGA TTTATCTCTC TGTACCCCCG 660
TCCTCCAACT CTACTCTGTC AGCCTCTCCT CCATCCCTTA CTTCCCTTCT AAATTCCAGG 720
AGATGACCTC ACTTTTGCAAA GCAAATTGGA GCCACCAAAT TGTAGCTCTC CTCGGTGGAA 780
ACTGCATCTG TGCTCATCCC TGCACCTTCT TGCAGAAAGC CGCCCCCTCA GGCCAAGATG 840
AGTGCCTGGC CCCCATGGGA GACTCAGACA CTTTGACCCC TTCTGACTTC AGCATCTCCC 900
TCTTTAAAGA TTCTCTCCCA ACATTCAGTC GTGCTCGA 938



(2) INFORMATION FOR SEQ ID NO: 212:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1551 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 212:

AGGCTGGACT AAGCATAGAG AACCAGGAGA GAAAGAAAGA TTTAAGAGAC TGAGTAATAT 60
TTTTTGACAG ATCATTTAAG AAACTGAGTA ATTTTTTTTT TCTCCAAAAG GGCATOGG7T 120
TTTTTTTTGT TTTGTTTTTT CTCTATTTGG CACTTTCTAG GGATTGGTCT ATA_AAT*TTTT 180
TGAAAGATCA TAGGATAAAT TTCTTTGTAG CAACTTCCTA TTTTAGTGTT TATGTTAGGG 240
GARCCCCARG TGTCCCTGCT GATACGCCAT TAGGGCCACT TCTCAGCCTC TGGCTACATC 300
ATAATGCTTT TTTTTCTATC TTGCCAAAGT TTCCMGAAAA TTKAKGTTTT CTAATTTTAA 360
AAAAATTGGT TGTGGAGATG GGATGGGACC TCTTTATAAG CCCTGAAAAT AAGTSATTTN 420
TTTTAAGTGC TATTCTGCTA TAAACCTGAT TCTCACTTTT TTCTGTAGAC AACAGTTTTT 480
TATAATATAT CTATTTTGTG TGGACATTAT TTCCTTTTAA CCAATACTGA AATTCCATAG 540
TGTAWACTTT CTCCACATTT TCTTTGATTA ATACTTYCTT AAAATAGACA CTTGGATTGG 600
CACCAGCTGT CACCAATAAA GCTGCCCTGA ACATTGTCAA TCAATCCTGT TAACCAATTT 660
GAGAATTTTT CTGGAATGCT TAGTTAGGGA TGAAATTGCT GGGTTATAGG TATGAGTATG 720
CTTGATATAC TTTTCTCCAG AATGTCTACA CCTGTGTGTA CACCACATCT CCAGAGATAG 780
GGGAATCTTA TGTCCCTGCT AACTGCTCTC GTTATTTAAT TTTCTGACAT TTGCCGCCGC 840
CGCCGCCCCC TGCCCCCAAC ACACACATGG TATAAAGTGG TAGTTTCTTG TTTTAAATTG 900
AACTTTTGAA TGATTTGAAT TTGGGCATTT CTTTGTATCC TGAGTTATTT TGGTTTCCCG 960
TTATGTGAAT ATCCTTTTCC TATGCTTTAA CTACTTTTCT AATTTGTCCC TTTTTTNGGT 1020
TATCAAATTC CAGGCCATTG TCTATTCCAT CGTCACTTTT GGGTATTGGA AACATCTTTC 1080
CATTCTGTAG CCTGTCTGTT GAACATAAAT CTTGATTTTT ATGTAATCAG ATTTTTCTCC 1140
TTACGGTTAT GTTCTTGGAA TTTTATTTAA GAAATCTTTT TCTATCCTGA GACCACAAAA 1200
ATGTCCCCAC CATTTTCTTC TGTTTCATAG TTTTGCCTTG TATGTTTAAT CCTTTAAGGC 1260
ATGTGTAGTT CATTTTATAT GGTGTGAAAT AGTTCTTATT CATTTATTCA ACACATATTG 1320
GTGGAGTGCC TGCTGATGGT AGTACTCTTC AGAGTACTTT GTATATATTT GTGAACACAT 1380
ATTCTTGCCC TGGAAGCTTA TGTTGTCNTT CAAGGTAGAT CCNTACTCGG TTTCCACCTG 1440
TTTTCTTCAG CCCTCAGGAT GAATTCCACA ATTTTACACA TAGCACCAGT TAAGGAATAG 1500
GCTTTATTGG AGAAAAGGAA GGCTTATTAG ACCAGCATCA GCAAAAAAAA A 1551



(2) INFORMATION FOR SEQ ID NO: 213:



GGAAACCACT AAATTCCACT TGACAAACCA 



AAACTTTCTC TTTGGCACCA TATGATTCTG TTACATTAGG C-rrC-.rC-.AT 3CTAAGA7AC 
ACAGCTAGGT CTACCAGCTG CCAGTGGTCA AGAACGAAAG AACCTCTC-.G AGAGAGATC-. 
GTTTCTAATA ACCTAACAGT TTTCCTTGG3 TATTACMAAA AAAAA^AAAA TTAGAATAA.-. 
ATGTCAGTGC CATGCAGGCA AGTACAGATA TGGAAATGAA AGCTCTGTCT ACAACTG-CAA 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 997 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 213:

AGAGAGTCCT CAACAGAACC TAATCATGCT GGCAJCCCT.-A 7CTCVTACT7 CTAGCCTC 

GAACTGAGAG AACATAAACT CCAGTTGT7T AAGCTACCC-. G^C7A70GTA TTTGTTA: 

T AGCCC AAG 0 TAAGTCAGGT GGAAAGGCAG AA\TA7TT7G ACA-GA?.7C-. 7TTCTAC-AA 1 30 

AACAGAGTTG TTCTAAATGA AATGGCCAGA TATTTCATCT 7C7TCV7ACT AGTA7TCA7G 240

AAAGTTTCAT TAAACACCAC TTGGCCAGCA CCCAGGCCTG CC-.CCCTCAC- AACGGCA---.C 

AAAAGCAAAT GATTTGAGGA ACAAAAGAGT GGACACAGAG CCTC7CAGAA GATGGCTC 

TCTTCTGAGA TGATCTTCTG AGATCATCAA TTTTCTGC-.C C7CA7GCC 77 ACTCCAA7TG 

TAGTAGATAA GAGCAAAGAC ACTTCCTGAT CCTG7GGAAA A7GC7 GGAC-C CCTGCICATG 

GAGAGGCTGA CACTGGGACC AACAGAAGGC CGGACATTTA 77TC-77GCAG tCCTTCTCC-. 

CCTGGGCCCT CTTCAGGCCT TGTACCTTC-C ACTCCCCA7G CCAC7G7AGC ACCTC-GTAAG 

CTGAAGTTAG GTATTTGAAG AGATAATTTG CCCCCAAC-A AGAA7C ACTT AAAAGAAAAA 



300 
360 
420 
430 
54 0 
600 
550 



7GCAAA7TTG 720 



730 
340 
900 
950 



GAT 



TTAATAAAAT TGATTGGGAT CACTCGA 997 



(2) INFORMATION FOR SEQ ID NO: 214: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1496 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 214:

GAATTCGGCA CGAGTGACCA CAGATATCTT TGGCTTTC-G CC7C-.CCACA ATGCTGTCCA 60
CTATGTTTTT TTTAATCGAT TGACATCTCA TGAATCCACA AATTTAGCCG CTTTTCCATC 120
TTTTCCATCT TTGTCATAGC TTCATCACGC ACGATGGAGG TCACTTCAGC ACTATCCGGA 180
3CGGCCTCAC GGACAGATCR GTGAATTTCC TTTTCCITTTT TCTTGATGTA CCGGATTGTC 240
GACTCGTTAA CATTGAGCTC ATGGCCAACA GCACTGTAAC TCATGCCTGA TTGGAGCTTA 300
TCCAACACGC C^GAMTTTCTC CGTAAGGSAM ATCAMGGTCT TCTTTCGCTT AGGAACACTG 360
GGCARARCTT AARCACTAC3 CTTGGGGGCC ATTTTAGAAA GCAAAACCAC CCACAAAAAG 420
CAGAAAAAAA AGTGTCAGTA AACAGACTGN NGANAGGACT CTTTGTTTAC AGCACAGGAG 480
CTGCGACTAG AAGGCGGCGC TTCTCCCCAG TTCAAACTTC AGCTGGGAAC CTTACCTCCG 540
CCAACTCCAA ATTTTCACCC TCTGCGCATG CCCGGGAAAS AAACCCCCAG AACAGTACCG 600
TGATGATTGA TTTTAGGGTT ACAAATACAT TTTAGCAAGT AAGTGAATTT GGCATTACGA 660
ATTAATGATT AATGAAGGTC ACCTGTATTT CCATAGATAT GTAATTTTAT TTAAGCAGGT 720
TTATTATATT AAGGCGGSGA GGCAGCGCCG AAGACTACAA GTTCCAGCAT GCACCGCGTC 780
CGGGCGGGTT CGGGCTCCCA GCGAGGGCTT CAGGGACGCC AGCCCGGAGG CATCGGCCGG 840
AAGTGTCGTA GGGCAACCAC GTAGTACTCT CTGCGCATGT GCAAAGCGCT GTCGGGGGCC 900
GCCCTAGCTG CCGTCGCCGC CGCCGGGGCT CTATGGTCTC TCCCTAGAGC TTTGCCGTTG 960
GAGGCGGCTG CTGCGGTCTT GTGAGTTTGA CCAGCGTCGA GCGGCAGCAA CATGGAGGAA 1020
TTCGACTCCG AAGACTTCTC TACGTCGGAG GAGGACGAGG ACTACGTGCC GTCGGGTGAG 1080
CGATTCCGCC TGAGGCGAGA AGCGAATTGC CCCGCCCCAC GCCTCACGTG AGGCGCGCTC 1140
TGCCCCCGCG GGCGTCTGCC CTGTGGCCCA GGTGGTCCAG GGGGGCTCCT GTTCTCGAGC 1200
GTCCGCTCCC TCAGGCCCCT CATCCTCGGC CGCTCCGGCC CGAGGCGTGT GCGCGTGGCG 1260
GTTCTGTGCT CCCCTCCCGT TGGGCAGCTC CGGCCGCCGC CCCCTCTTGC AGCGCGGGAA 1320
CGGCACATGG ACACGGCCCC TTGTCGCTAG GGACGCTCGT CGGTCAGCCC CGAACGACAA 1380
CGCTGCTTCA GAAGTCGGGG CGGCAGTTCG AGCCTTGGAA GTTTTTTTCA GCCCTGGCCC 1440
GAGAGAGCTG CTGGCCAACA ACCCGTCCAA GATAGAGCTG TCCGNTCTCC GNCTGG 1496



(2) INFORMATION FOR SEQ ID NO: 215:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1308 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 215:

TTGGCANCNG GGAGAGGGAA AGAGGAGGAA ATGCGGTTTG AGOACCATGG CTTACCTTTC 60
C^GCCTTTGA CCCATCACAC CCCATTTCCT CCTCTTTCCC TCTCCCCGCT GCCAAAAAAA 120
AAAAAAAAGG AAACGTTTAT CATGAATCAA CAGGGTTTCA GTCCTTATCA AAGAGAGATG 180
TGGAAAGAGC TAAAGAAACC ACCCTTTGTT CCCAACTCCA CTTTACCCAT ATTTTATGCA 240
ACACAAACAC TGTCCTTTTG GGTCCCTTTC TTACAGATGG ACCTCTTGAG AAGAATTATC 300
GTATTCCACG TTTTTAGCCC TCAGGTTACC AAGATAAATA TATGTATATA TAACCTTTAT 360
TATTGCTATA TCTTTGTGGA TAATACATTC AGGTGGTGCT GGGTGATTTA TTATAATCTG 420
AACCTAGGTA TATCCTTTGG TCTTCCACAG TCATGTTGAG GTGGGCTCCC TGGTATGGTA 480
AAAAGCCAGG TATAATGTAA CTTCACCCCA GCCTTTGTAC TAAGCTCTTG ATACTGGATA 540
TACTCTTTTA AGTTTAGCCC CAATATAGGG TAATGGAAAT TTCCTGCCCT CTGGGTTCCC 600
CATTTTTACT ATTAAGAAGA CCAGTGATAA TTTAATAATG CCACCAACTC TGGCTTAGTT 660
AAGTGAGAGT GTGAACTGTG TGGCAAGAGA GCCTCACACC TCACTAGGTG CAGAGAGCCC 720
AGGCCTTATG TTAAAATCAT GCACTTGAAA AGCAAACCTT AATCTGCAAA GACAGCAGCA 780
AGCATTATAC GGTCATCTTG AATGATCCCT TTGAAATTTT TTTTTTGTTT GTTTGTTTAA 840
ATCAAGCCTG AGGCTGGTGA ACAGTAGCTA CACACCCATA TTGTGTGTTC TGTGAATGCT 900
AGCTCTCTTG AATTTGGATA TTGGTTATTT TTTATAGAGT GTAAACCAAG TTTTATATTC 960
TGCAATGCGA ACAGGTACCT ATCTGTTTCT AAATAAAACT GTTTACATTC ATTATGGGGT 1020
ATGTATGACC TTCATTTTCC AAGAAATAGA ACTCTAGCTT AGAATTATGG ATGCTCTAAA 1080
ATGTCAGAAT GGGAACTCTC CTCGAAGTTC TCCCAAACTC AGAGACAGCA CTGCCTTCTC 1140
CTAAATGATT ATTCTTTTCT CCCTGTTTTC TGGTATTTTC TAGGCATCCT TCTCACCACA 1200
GCCATAACCC TTTTTTACTT CCATTAGGCC GTATAACTGG NGGGACNGCT GGTCGGTATA 1260
TAATACTGGT WCCAACAMAG GGGTTCTGGA TGTACACMAG GTTATCTT 1308

(2) INFORMATION FOR SEQ ID NO: 216:



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1705 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 216:

TGGCCATGGA AGCGCTAGAA GGTTTAGATT TTGAAACAGC AAAGAAGGAT TTCCTTGGAT 60
CTGGAGACCC CAAAGAAACA AAGATGCTAA TCACCAAACA GGCTGACTGG GCCAGAAATA 120
TCAAGGA3CC CAAAGCCGCC GTGGAGATGT ACATCTCAGC AGGAGAGCAC GTCAAGGCCA 180
TCGAGATC7G TGGTGACCA7 GGCTGGGTTG ACATGTTGAT CGACATCGCC CGCAAACTGG 240
ACAAGGCTGA GCGCGAGCCC CTGCTGCTGT GCGCTACCTA CCTCAAGAAG CTGGACAGCC 300
CTGGCTATGC TGCTGAGACC TACCTGAAGA TGGGTGACCT CAAGTCCCTG GTGCAGCTGC 360
AGTGGAGACC CAGCGCTGGG ATGAGGCCTT TGCTTTGGGT GAGAAGCATC CTGAGTTTAA 420
GGATGACATC TACATGCCGT ATGCTCAGTG GCTAGCAGAG AACGATCGCT TTGAGGAAGC 480
CCAGAAAGCG TTCCACAAGG CTGGGCGACA GAGAGAAGCG GTCCAGGTGC TGGAGCAGCT 540
CACAAACAAT GCCGTGGCGG AGAGCAGGTT TAATGATGCT GCCTATTATT ACTGGATGCT 600
GTCCATGCAG TGCCTCGATA TAGCTCAAGA TCCTGCCCAG AAGGACACAA TGCTTGGCAA 660
GTTCTACCAC TTCCAGCGTT TGGCAGAGCT GTACCATGGT TACCATGCCA TCCATCGCCA 720
CACGGAAGAT CCGTTCAGTG TCCATCGTCC TGAAACTCTT TTCAACATCT CCAGGTTCCT 780
GCTGCACAGC CTGCCCAAGG ACACCCCCTC GGGCATCTCT AAAGTGAAAA TACTCTTCAC 840
CTTGGCCAAG CAGAGCAAGG CCCTCGGTGC CTACAGGCTG GCCCGGCACG CCTATGACAA 900
GCTGCGTGGC CTGTACATCC CTGCCAGATT CCAAAAGTCC ATTGAGCTGG GTACCCTGAC 960
CATCCGCGCC AAGCCCTTCC ACGACAGTGA GGAGTTGGTG CCCTTGTGCT ACCGCTGCTC 1020
CACCAACAAC CCGCTGCTCA ACAACCTGGG CAACGTCTGC ATCAACTGCC GCCAGCCCTT 1080
CATCTTCTCC GCCTCTTCCT ACGACGTGCT ACACCTGGTT GAGTTCTACC TGGAGGAAGG 1140
GATCACTGAT GAAGAAGCCA TCTCCCTCAT CGACCTGGAG GTGCTGAGAC CCAAGCGGGA 1200
TGACAGACAG CTAGAGATTT GCAAACAACA GCTCCCAGAT TCTTGCGGCT AGTGGGAGAC 1260
CAAGGGACTC CATCGGAGAT NAGGACCCGT TCACAGCTAA GCTRAGCTTT GAGCAAGGTG 1320
GCTCARAGTT CGTGCCAGTG GTGGTGAGCC GGXTTGGTGCT GCGCTCCATG AGCCGCCGGG 1380
ATGTCCTCAT CAAGCGATGG CCCCCACCCC TGAGGTGGCA ATACTTCCGC TX^CTGCTGC 1440
CTGACGCCTC CATTACCATG TGCCCCTCCT GCTTCCAGAT GTTCCATTCT GAGGACTATG 1500
AGTTGCTGGT GCTTCAGCAT GGCrcCTCCC CCTACTGCCG CAGGTGCAAG GATGACCCTG 1560
GCCCATGACC AGCATCCTGG GGACGGCCTG CACCCTCTGC CttXTCTTGGG GTCTGCTGGG 1620
CTGTGAAGGA GAATAAAGAG TTAAACTGTC AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1680
AAAAAAAAAA AAAAAAAAAA AAANA 1705

(2) INFORMATION FOR SEQ ID NO: 217:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 999 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 217:
AGCAAATCAC CTTAACGATC TGGAATGAAA CTGTGACCAG TGCCGCCCTG GGTO7TTCTG 60
GAGAGACTGC GGTCTTCTTG TTTGGCCATA GGTGCTGGGG CCCCGGCITC AGTCACTGTC 120
TCAGACAGKA GTCCCGATAA GCAGATCACC AGTCCTCCAC TGTCCTTCCT GTCGGCCTTG 180
CTGCATGAGA AGATAGCTGC TTCCTCCCTC TTTTCCTACA CTGTAAATTA TTGTTTTACA 240
ATTGAGTGYC TTAATAATAG TYTACAAATA CTATGTATTT ATGCAAAACT GTTAAAGTTC 300
TCATCTGTTA TGATTGGATA CTTGGTCTTG TCAGTAGTGG TCAGCATTGG GTTGTGAGCT 360
TGTCCTACTC CATACGTGTT TATCCTGCTA TGCATTTTAC ATTGTGTGTT CACATCTATT 420
rCAAGGAGCC TTGCTAGAAA CAACACTGGC GGTTCCTGCA GGCCAGGCAG GCATTGGCCC 480
ATGCTGTGTC CCATAGGAGC CAATGGAAAG AACGTAGCTT GGTCTGCTAG CCAGCCGTGG 540
GGTGGCGCAG GCCAGGCAGC CTCTGCACCA GAGTCCAGCA CCTGCCCATT CCCCAGTCAC 600
ACAATCATAC TCTTCTTTCA TAGAGATTTT ATTACCACCT AGACCACCCT AGTTTTCCTC 660
TCTGTTAGTG TCCTGAGCTC TTTTGCAACA AAATGTAGGT ACAGACCAAT CCCTGTCCCT 720
TCCCCAATCA GGAGCTCCAC ACCATGAGTT GTTTGGTTTT CCAGAAGCTG CCAGTGGGTT 780
CCCGTGAATT GCGTTAAGAT ATCGATGATK TTTTTTATTG TTTTTCTTCT TGTTTTTTTA 840
AATAATATAT TTAAAGGCAG TATCTTTTGT ACTGTGAATT TGCAGTAGAA GATGCAGAAT 900
GCACTTTTTT TTTACTTCTG TTGGTGTGTA TTGTATATAG TGTGTGTGCT TCTTGTGATG 960
AAAATAAACT TTTTCTTTAT AAAAAAAAAA AAAAAAAAC 999



(2) INFORMATION FOR SEQ ID NO: 218:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 941 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 218:

GGCACGAGTA GCATTTCATT 7AATCTGCAG GTATATTCTC CCAACAGTTT ATTGTCATGT 60
GATGTCCTCA GCCAAGATTG TRAGGCAGAG AGGAGCTGTC CCAACCTACT A7ACCACCGA 120
GGC-GGAGAG ATCATATTTT TGGTATTAAA CTGGAGTCTC TCCATCCTTC ACATTGTTGA 180
TGTCCTCTGT AGCAAACCGG AAAAGTCAGT GACAGAAGAT gccgctagg:- GiTTGAGCCA 240
•"i^a^/T^A^A GrTC** , GG'XTT GGAGAAAAGG GCCGGATGGT '3GCTCTAGAA AGCCCATCCT 300
^r-TY-v-nv--T v r^ ^^TTTCTCC r CCTTATATT GTGCTTTCAT TCATTCATTC ATTCATCAAA 360
^ AT^r^T^v-i, fT-ATTA^A TGTGTCAAGC TCTGTGCTAG CCTCTGGAAA ACCTGCCCTC 420
ATGTAGCTCA CTGTGGAGTA GGAGAAACAA TGACTACACT ATGATAAGCA CGGGTTGTCA 480
^^orTv-^^i^a HarVAPTW^" CCPTCATC^A GACCGATGAG GTCAAAGAAG GCATCCAGGC 540
rfrr\T^TTr Trir^TTia ~TGAAGAATG AGAGGGAGCT GCACCASCAG GGGTTGGAAC 600
TGAAGGTGGC AGTGCCTGGA GTCTTGATTC CAGCAGAGGG AGAGCAGTCT GTGAAAAGGC 660
ACCAAGGGTG GGAGAGGGCA GAGCACATGG AGGAACTTCA GGTAGTTCTG GATGGCSCTG 720
GGGCAAAGCT AGAGAGGTAA GAAGAATCTA CAAATGTTCC TCGAGTTACA TGAACTTCCA 780
TCCCAATAAA CCCATTGGAA ACGAAAAATT TAAGTCAGAA GTGCATTTAA GGCTGGTCC3 840
AGTAGAATGA TTTTTACAAC GAATTGATCA CAACCAGTTA CAGATGTCTT TGTTCCTTCT 900
CCACTCCCAC TGCTTCACCT GACTAGCCTT TAAAAAAAAA A 941



(2) INFORMATION FOR SEQ ID NO: 219:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 575 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 219:
TAAGTGGAAT CCCCCGGGGT TGCAGGGAAT TCGGCACGAG GCATTCTGAG AAGCTTAAGA 60
CATACTTTGA AGACAACCCT AGGGACCTCC AGCTGCTGCG GCATGACCTA CCTTTGCACC 120
CCGCAGTGGT GAAGCCCCAC CTGGGCCATG TTCCTGACTA CCTGGTTCCT CCTGCTCTCC 180
GTGGCCTGGT RCGCCCTCAC AAGAAGCGGA AGAAGCTGTC TTCCTCTTGT AGGAAGGCCA 240
AGAGAGCAAA GTCCCAGAAC CCACTGCGCA GCTTCAAGCA CAAAGGAAAG AAATTCAGAC 300
CCACAGCCAA GCCCTCCTGA GGTTGTTGGG CCTCTCTGGA GCTGAGCACA TTGTGGAGCA 360
CAGGCTTACA CCCTTCGTGG ACAGGCGAGG CTCTGGTGCT TACTGCACAG CCTGAACAGA 420
CAGTTCTGGG GCCGGCAGTG CTGGGCCCTT TAGCTCCTTG GCACTTCCAA GCTGGCATCT 480
TGCCCCTTGA CAACAGAATA AAAATTTTAG CTGCCCCAAA AAAAAAAAAA AAAAAAAAAA 540
CTCGAGGGGG GGCCCGTACC CAATTCGCCC TATAA 575

(2) INFORMATION FOR SEQ ID NO: 220:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3018 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 220:

GCCAGCCTTA CAGGTTTTAC GTGAAATGAA AGCCATTGGA ATAGAACCCT CGCTTGCAAC 60
ATATCACCAT ATTATTCGCC TGTTTGATCA ACCTGGAGAC CCTTTAAAGA GATCATCCTT 120
CATCATTTAT GATATAATGA ATGAATTAAT GGGAAAGAGA TTTTCTCCAA AGGACCCGGA 180
TGATGATAAG TTTTTTCAGT CAGCCATGAG CATATGCTCA TCTCTCAGAG ATCTAGAACT 240
TGCCTACCAA GTACATGGCC TTTTAAAAAC CGGAGACAAC TGGAAATTCA TTGGACCTGA 300
TCAACATCGT AATTTCTATT ATTCCAAGTT CTTCGATTTG ATTTGTCTAA TGGAACAAAT 360
TGATGTTACC TTGAAGTGGT ATGAGGACCT GATACCTTCA GCCTACTTTC CCCACTCCCA 420
AACAATGATA CATCTTCTCC AAGCATTGGA TGTGGCCAAT CGGCTAGAAG TGATTCCTAA 480
AATTTGGGAA AGATAGTAAA GAATATGGTC ATACTTTCCG CAGTGACCTG AGAGAAGAGA 540
TCCTGATGCT CATGGCAAGG GACAAGCACC CACCAGAGCT TCAGGTGGCA TTTGCTGACT 600
GTGCTGCTGA TATCAAATCT GCGTATGAAA GCCAACCCAT CAGACAGACT GCTCAGGATT 660
GGCCAGCCAC CTCTCTCAAC TGTATAGCTA TCCTCTTTTT AAGGGCTGGG AGAACTCAGG 720
AAGCCTGGAA AATGTTGGGG CTTTTCAGGA AGCATAATAA GATTCCTAGA AGTGAGTTGC 780
TGAATGAGCT TATGGACAGT GCAAAAGTGT CTAACAGCCC TTCCCAGGCC ATTGAAGTAG 840
TAGAGCTGGC AAGTGCCTTC AGCTTACCTA TTTGTGAGGG CCTCACCCAG AGAGTAATGA 900
GTGA1TTTGC AATCAACCAG GAACAAAAGG AAGCCCTAAG TAATCTAACT GCATTGACCA 960
GTGACAGTGA TACTGACAGC AGCAGTGACA GCGACAGTGA CACCAGTGAA GGCAAATGAA 1020
AGTGGAGATT CAGGAGCAGC AATGGTCTCA CCATAGCTGC TGGAATCACA CCTGAGAACT 1080
GAGATATACC AATATTTAAC ATTGTTACAA AGAAGAAAAG ATACAGATTT GGTGAATTTG 1140
TTACTGTGAG GTACAGTCAG TACACAGCTG ACTTATGTAG ATTTAAGCTG CTAATATGCT 1200
ACTTAACCAT CTATTAATGC ACCATTAAAG GCTTAGCATT TAAGTAGCAA CATTGCGGTT 1260
TTCAGACACA TGGTGAGGTC CATGGCTCTT GTCATCAGGA TAAGCCTGCA CACCTAGAGT 1320
GTCGGTGAGC TGACCTCACG ATGCTGTCCT CGTGCGATTG CCCTCTCCTG CTGCTGGACT 1380
TCTGCCTTTG TTGGCCTGAT GTGCTGCTGT GATGCTGGTC CTTCATCTTA GGTGTTCATG 1440
CAGTTCTAAG ACAGTTGGCG TTGGGTCAAT AGTTTCCCAA TTTCAGGATA TTTCGATGTC 1500
AGAAATAAC3 CATCTTAGGA ATGACTAAAC AAGATAATGG CAGTTTAGGC TGCACAACTG 1560
GTAAAATGAG TGTAGATAAA TGTTGTAA7T AGTGTACACG TTTGTATTTT TGTTAATATA 1620
GCCGCTGCCA TAGTTTTCTA ACTTGAACAG C3ATGAATGT TTCATGTCTC CCTTTTTTTT 1680
TTGTCTATAG CTGTTACCTA TTTTAGTGGT TGAAATGAGA GCTAGTGATG ACAGAAGGAT 1740
GTGGAATGTC TTCTTGACAT CATTGTGTAT T3CTGGTAAT CAAGTTGGTA ACGACTACTT 1800
CTAGCAGCTC TTACCACTAT GACTTAAGTG GTCCPGGAA3 GCAGTAAGTG GAC-GTTTGCA 1860
GCATTCCTGC CTTCATGAGG GCTTCTACCA CTGACCACTT TGCACGTACC TGGCTCCCAG 1920
ATTTACTTAG GTACCCCACG AGTCGTCCAC ATAAGCAGCT TCATCTTTAC CTTGCCAGAG 1980
TTGACAATTA TGGGATACTC TAGTCTACTT ATACTTGTGT TCCCATCTGT CF3CCATCCT 2040
CTGAAGGCCA GGACCCAGTC ATACATCCTT AGAAACCAAA GTATGGTTTT TGTTTTCTCT 2100
TGGAATGTCA GGTCTTAAGG CATTTAATTG AGGGACAAAA AAAAAAAAAA GCCGATATAG 2160
TAGCTAGCTA CTTAAGCATC CATGGGTATT GCTCCATATC AAAGCAGATT TGCAGGACAG 2220
AAAGAGTAAA TTAGCCTTCA GTCTTGGTTT ACAGCTTC2A AGGAGAGCCT TGGSCACCTG 2280
AAATGTTAAC TCGGTCCCTT CCTGTCTCTA GTTCATCAGC ACCTGCAGAT GCCTGACTCT 2340
TGTTAGCCTT ACTATTCAAT ACAGTCCTTA GATTCACGGT ATGCCTCTTC CTATCCAGGC 2400
ACCTATTCTG AATCACCATG TTGCTCTGCA GCTAGAGTTG ATAGGAGAAA ATCCATTTGG 2460
GTAGATGGCC TATGAATTTG TAGTAGACTT TCAAAATGAG TGATTTGTTA GCTTGGTACT 2520
TTTAAGTTTG TGGTACAGAT CCTCCAAACC CATACTCTGA GCAATTAACT GCCTTGAACA 2580
TAGAGAAAAA TTAAGGCCTC ACAGGATGAG TCTCCATTCT CTGTAAATGC TTATTTTATC 2640
ATAGTCTTTA GCCTCTAACT ATGAGTAAAA TGTTCTCTTC GGCCGGGTGT GGTGACTCAC 2700
ACCTGTAACC TCAGCACTTT GGGAGGCAGA GGTGGGAGGA TCACTTAGGT CCAGGAGTTC 2760
GAGACTAGCC TGGGCAACAT AGTGAGACAC CGGATCTACA AAAAAATAAA AAGCCAGACT 2820
GGTGGTATGT ATCTGTGTCC CAGCTAATTG GGAGGGTGAG ATGGGAGGAT TGTTTGAGCC 2880
TAGGAGAGGG AGGTTGCAGT GAGCCGTGAT CGCACCACTG CACTCCAGCC TGGGCAACAG 2940
AGCAAGACCC TGTCTTGGAG AAACGAGAAT TTTGGAAGAG CAAATGGGGC TGAGTGCAGT 3000
GGCTCATGCC TGTAATCC 3018

(2) INFORMATION FOR SEQ ID NO: 221: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH : 96 3 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 221: 
3GCACGAGGG CCGCGGGACA TCCACGGGGC GCGAGTGACA CGCGGGAGGG AGAGCAGTGT 
TCTGCTGGAG CCGATGCCAA AAACCATGCA TTTCTTATTC AGATTCATTG TTTTCTTTTA 
TCTGTGGGGC CTTTTTACTG CTCAGAGACA AAAGAAAGAG GAGAGCAC03 AAGAAGTGAA
AATAGAAGTT TTGCATCGTC CAGAAAACTG CTCTAAGACA AGCAAGAAGG GAGACCTACT
NAAATGCCCA TTATGACGGC TACCTGGCTA AAGACGGCTC GAAATTCTAC TGCAGCCGGA
CACAAAATGA AGGCCACCCC AAATGGTTTG TTCTTGGTGT TGGGCAAGTC ATAAAAGGCC
TAGACATTGC TATGACAGAT ATGTGCCCTG GAGAAAAGCG AAAAGTAGTT ATACCCCCTT
CATTTGCATA CGGAAAGGAA GGCTATGCAG AAGGCAAGAT TCCACCGGAT GCTACATTGA
TTTTTGAGAT TGAACTTTAT GCTGTGACCA AAGGACCACG GAGCATTGAG ACATTTAAAC
AAATAGACAT GGACAATGAC AGGCAGCTCT CTAAAGCCGA GATAAACCTC TACTTGCAAA 
GGGAATTTGA AAAAGATGAG AAGCCACGTG ACAAGTCATA TCAGGATGCA GTTTTAGAAG 
ATATTTTTAA GAAGAATGAC CATGATGGTG ATGGCTTCAT TTCTCCCAAG GAATACAATG 
TATACCAACA CGATGAACTA TAGCATATTT GTATTTCTAC TTTTTTTTTT TA3CTATTTA 
CTGTACTTTA TGTATWAAAC AAAGTCMCTT TTCTCCMAGT TGKATTTGCT ATTTTTCCCC 
TATGAGAAGA TATTTTGATC TCCCCAATAC ATTGATTTTG GTATAATAAA TGTGAGGCTG 
TTTTGCAAAC TTAAAAAAAA ATTTAAAAAA ACTGGAGGGG GGCCCGTACC CAANTCGCCG 
NATATGAT 

(2) INFORMATION FOR SEQ ID NO: 222: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1404 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 222:
CGTTTTCCGG CCGTGCGTTT GTGGCCGTCC GGCCTCCCTG ACATGCAGCC CTCTGGACCC 60
CGAGGTTGGA CCCTACTGTG ACACACCTAC CATGCGGACA CTCTTCAACC TCCTCTGGCT 120
TGCCCTGGCC TGCAGCCCTG TTCACACTAC CCTGTCAAAG TCAGATGCCA AAAAAGCCGC 180
3TCAAAGACG CTGCTGGAGA AGAG7CAGTT TTCAGATAAG CCGGTGCAAG ACCG3GGTTT 240
3-3TGGTGACG GArCTCAAAG CTGAGAGTGT 3GTTCTTGAG CAT2GCAGCT ACT G3*rCGGC 300
AAAGGCCCGG GACAGACACT TTGCTGGGGA TGTACTGGGC TATGTGACTC CATG3AACAG 360
:CAT03CTAC GATGTCACCA AGCOTTTTGG GAGCAAGTTC ACACA3ATCT CACCCGTCTG 420
! jCTGCAGCT G AAGAGACGTG GCCGTGAGAT GTTTGAGGTC ACGjGCCTCC ACGACGTGGA 480
CCAAG3CTG3 ATGCGAGCTG TCAGGAAGCA TG-:CAAGGO_ CTG'JACATAG TGCCTCGGCT 540
CrTGTTTGAG GACTGGACTT ACGATGATTT CCGGAACGTC TTAGACAGTG AG3ATGAGAT 600
AGAG3AC«CTG AGGAAGACCG TGGTCCA3GT 3G2AAAGAAC CAGZATTTCG ATG3CTTCGT 660
GGTGGAGGTC TGGAACCAGC TGCTAAG2CA GAAGCGOGTG GG02TCATCC ACATGCTCAG 720
CCACTTGGCC GAGGCTCTGC ACCAGGCCCG GCTGCTGGCC CTCCT3GTCA TCCCGCCTGC 780
CATCACCCCZ GGGACCGACC AGCTGGG2AT GTTCACGCAC AAGGA3TTTG AG2AGCTGQ:: 840
CCC03TGCTG GATGGTTTCA GCCTCATGAC CTACGACTAC TCTACAGCGC ATCAGCCTGG 900
CCCTAATGCA CCCCTGTCCT GGGTTCGAGC CTGCGTCCAG GTCCT3GACC CGAAGTCCAA 960
GTGGCGAAGC AAAATCCTCC TGGGGCTCAA CTTCTATGGT ATGGAGTACG CGACCTCCAA 1020
GGATG2CCGT GAGCCTGTTG TGGGGGCCAG GTACATCCAG ACACT3AAGG ACCACAGGCO 1080
CCGGATGGTG TG3GACAGCC AGGYCTCAGA CCACTTCTTC GAGTACAAGA AGAGCCGCAG 1140
TGGGAGGCAC GTCGTCTTCT ACCCAACCCT GAAGTCCCTG CAGGTGCGGC TGGAGCTGG: 1200
CCGGGAGCTG GGCGTTGGGG TCTCTATCTG GGAGCTGGCC AGGGCCTGGA CTACTTCTAC 1260
GACCTGCTCT AGGTGGGCAT TGCGGCCTCC GCGGTGGACG TGTTCTTTTC TAAGCCATGG 1320
AGTGAGTGAG CAGGTGTGAA ATACAGGCCT NCACTCCGTT TGCTGTGAAA AAAAAAAAAA 1380
AAAAAAAAAA AAAAAAAAAA AAAA 1404

(2) INFORMATION FOR SEQ ID NO: 223: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 707 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 223:

NGCGCGCCTG CAGTCGACAC TAGTGGATCC AAAGAATTCG GCACGAGGGC AGGTCGAGGG 60
CTCAGAAATC AGCTCTATTG ACGAATTCTG CCGCAAGTTC CGCCTGGACT GCCCGCTGGC 120
CATGGAGCGG ATCAAGGAGG ACCGGCCCAT CACCATCAAG GACGACAAGG GCAACCTCAA 180
CCGCTGCATC GCAGACGTGG TCTCGC7C77 IA7GAIGG7C 7GI7C7TGGA 240
GATCCGC3CC ATGGATGAGA TCCAGCC7G-. IC7GCG-GAG C77A7GGA7A CGA7GCACC3 300
catgagciac ctcccacccg actttgaggg ccgccagacg giiagciatt gggtgiaga: 360
CCTGAGC3GC ATGTCGGCGT CAGATGAGC7 GGAC3AZTCA. CAGGTC-IG7C AGA7G77GTT 420
CGACCTGGAG TCAGCCTACA ACGCCTTCAA 7GGCT7CCTG 1-7GICTGAG CC7GGGGCA2 480
TAGC7CTTGC ACAGAAGGGC AGAGTC7GAG GCGATG3CTC C7GGTCG77T GTCCGICACA 540
CAGGCCGTGG TCATCCACAC AACTGAGTGT 7TGCAG7TG7 TIG7I7GG7G TCGGTOTTTO 600
GTGTCAGAAC TTTTGGGCCG GGCCCCTGCC CACAA7AAAG A7GCTC777G AC77TCAAAA 660
AAAAAAAAAA AAAAACTCRG GGGGC-GCCCG GIG ICAATCC CCCGJCZ; 707

(2) INFORMATION FOR SEQ ID NO: 224:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1334 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 224:
GGGGAACTGC AGTGACAGCA GGAGTAAGA7- 7GGGAGGCAG GA 7 AGAG77G GGACACAGGT 60
ATGGAGAG3G GGTTCAGCGA GCC7AGAGAI GG7AGA7TA7 7AG-GG77-77G GC GG7GAGAA 120
TCCAGGGAGA GGAGCGGAAA CAGAAGAGG3 3CAGAAGACC GGG3CAG7TG TGGG7TGCAG 180
AGCCCCTCAG CCATGTTGGG AGO ZAAGCCA GAGTGGCTAC CAGG7C777T AxTAGAGTCCI 240
GGGCTGCCCT TGGTTCTGGT GCT7CTGGC7 77GGGG3CCG GGTGGG77CA GGAGGGGTCA 300
GAGCCCGTCC TGCTGGAGGG GGA3TGCCTG GIGGTG7GTG AC-77TGG7CG AG77GCTGCA 360
GGGGGGCCCG GGGGAGCAGC CCTGGGAGAG GCA7CCIC7G GGCGAG7GGC A7T7GCTGCG 420
GTCCGAAGC7 AMCACCATGA GCCAGCAGGG GAAACCGGCA A73GCA77AK TC-GGGCCATC 480
TACTTCGACC AGGTCCTGGT GAAIGAGGGC GGTGGCTTTG ACCGGG7GTC TGGCTCCTTC 540
GTAGCCCCTG TCCGGGGTGT CTAGAGCTTC 7GGTT7IA7G TGGTGAAGGT GTACAACCGC 600
CAAACTGTCC AGGTGAGCCT GA7GCTGAAC ACGTGGIC7G TCATC7GAGC C777GCCAAT 660
GATCCTGACG TGACC7GGGA GGCAGCCAC7 AGCTCTGTGC TACTGCGGT7 GGAC77TGGG 720
GACCGAGT3T CTCTGCGIGT GCG7CGGGGI AATCTACTGG G7GGT7GGAA A7A7TCAAG7 780
TTCTCTGGCT TCCTCATCTT CCC7CTCTGA GGACCCAAGT YTTTCAAGCA C1AAGAATCCA 840
gcccctgaca actttcttct gccctc7ctt
ctxrcctctgg ytcctatccc acytctttgc 
agaraarar y ararctgwgg caggtataca 
taaccatgca tcytcttgct tggccacctc 
ttagtccctc camactctga ctgctgcctc 
tcactgtac : tgttccagca tatccccact 
attctcct-:c 7taggcttcc ta7tacctgg 
cctgccagta tgctaaaccc tccctctctc 
ctggatgaat ctatcaataa aacaactaga 

TCGA 



ATGGGAMCCT G'r jCCAAACA CCCAAGTTTA 

GA3CTGGAAG TGGACCATGG AAAACAT5GA 

CTGAAACTGT CCACCTTT3A AGTTTGAACT 

CTTCCTCCCA GC7CTCTCAC 7GAGTTATYT 

ATCTCTCTTT CTCCTGATCT GTGCTGTCTT 

GATTCCATGA TTCATTCC7T CAGACCCTCT 

TTTCTTATCC CGCTGTCCrA TTGGCCCAGC 

GAATGGTGGT CAAAAAAAAA AAAAAAAAAC 



(2) INFORMATION FOR SEQ ID NO: 225: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 760 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 225: 
GGGTCGACCC ACGCGTCCGC TGACCAGTCC GTTATAGATA CTTCTTCCTA TACCAAAACT 
GTTTAAACAG GTGCCACCAC AAGGGATGTC GTCCTTACTC TCTGCGGGTC TTCAAGCATC 
CCTTTGTGGG AAARGTCTCT GGGCAAGCAC GTGGTATTTG GTCTGCTGCT TGCTTCCCTT 
TTTCCACCAG GGATGTTGTG ATCATAAGTC AAAACAACAG TATATTCCAA ATCTCAAAAG 
CTATTGTGGC CTGAGCACAA TTGAAATCTA GCAGAGTTTT TCCTATGTAG CTTTAGAGTA 
ACTCTTCTGC TTCTCTGTCA CTTACAATTC AGGTTCTGCC TTTGCCTAAG AGCATGAGCA 
GAAGAGTCCT CATGTGACGC TTAGTTCTAT TGCAGTCCTG GGTGAAACTA TTTAAGCWAT 
GGGGCTGCTK CTCCCCANWT CCTCCCTAAC AATTCGTTGT GTGGACTTCT CATCTAAAAG 
GTTAGTGGCT TTTGCTTGGG ATCAGTGCTC TCTATTGATG TTCTTGCTGG TCTCCAGACA 
CATTCCTGTT GCATTAAGAC TTGAAAGACT TGTAGATGTG TGATGTTCAG GCACAGGATG 
CTGAAAGCTA TGTTACTATT CTTAGTTTGT AAATTGTCCT TTTGATACCA TCATCTTGTT 
TTCTTTTTGT AGGTATAAAT AAAAACACTG TTGACAATAA AAAAAAAAAA AAAAAAAAAA 
AAAAAAAAAA AAAAAAAAAA NAAAAAAAAA AAAAAAAAAA 
(2) INFORMATION FOR SEQ ID NO: 226:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2057 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 226:

CCGAGCCGGC TGC3CCGGGG GAATCCGTGC GGGCGCCTTC CGTCCCR3TC CCATCCTCGC 60
CGCGCTCCAG CAC3TCTGAA GTTTTGCAGC GCCCAGAAAG GAGGCGAGGA AGGAGGGAGT 120
GTGTGAGAGG AGGGAGCAAA AAGCTCACCC TAAAACATTT ATTTCAAG3A GAAAAGAAAA 180
AGGGGGGGCG CAAAAATGGC TGGGGCAATT ATAGAAAACA TGAGCACCAA GAAGCTGTGC 240
ATTGTTGGTG GGATTCTGCT CGTGTTCCAA ATCATCGCCT TTCTGGTGGG AGGCTTGATT 300
GCTCCAGGGC CCACAACGGC AGTGTCCTAC ATGTCGGTGA AATGTGTGGA TGCCCGTAAG 360
AACCATCACA AGACAAAATG GTTCGTGCCT TGGGGACCCA ATCATTGT3A CAAGATCCGA 420
GACATTGAAG AGGCAATTCC AAGG3AAATT GAAGCCAATG ACATCGTGTT TTCTGTTCAC 480
ATTCCCCTCC CCCACATGGA GATGAGTCCT TGGTTCCAAT TCATGMTGTT TATCCTGCAG 540
CTGGACATTG CCTTCAAGCT AAACAACCAA ATCAGRGAAA ATGCAGAAGT CTCCATGGAC 600
GTTTCCCTGG CTTACCGTGA TGACGCGTTT GCTGAGTGGA CTGAAATGGC CCATGAAAGA 660
GTACCACGGA AACTCAAATG CACCTTCACA TCTCCCAAGA CTCCAGAGCA TGGAGGGCCG 720
GTTACTATGA ATGTGATGTC CTTCCTTTCA TGGAAATTGG GTCTGTGGCC CATGAAGTTT 780
TACCTTTTAA ACATCCGGCT GCCTGTGAAT GAGAAGAAGA AAATCAATGT GGGAATTGGG 840
GAGATAAAGG ATATCCGGTT GGTGGGGATC CACCAAAATG GAGGCTTCAC CAAGGTGTGG 900
TTTGCCATGA AGACCTTCCT TACGCCCAGC ATCTTCATCA TTATGGTGTG GTATTGGAGG 960
AGGATCACCA TGATGTCCCG ACCCCCAGTG CTTCTGGAAA AAGTCATCTT TGCCCTTGGG 1020
ATTTCCATGA CCTTTATCAA TATCCCAGTG GAATGGTTTT CCATCGGGTT TGACTGGACC 1080
TGGATGCTGC TGTTTGGTGA CATCCGACAG GCATCTTCTA TGCRATGCTT CTKTCCTTCT 1140
GGATCATCTT CTGTGGCGAG CACATGATGG ATCAGCACGA GCGGAACCAC ATCGCAGGGT 1200
ATTGGAAGCA AGTCGGACCC ATTGCCGTTG GTCCTTCTGC CTCTTCATAT TTGACATGTG 1260
TGAGAGAGGG GTACAACTCA CGAATCCCTT CTACAGTATC TGGACTACAG ACATTGGGAA 1320
CAGAGCTGGC CATGGCTTTC ATCATCGTGG CTGGAATCTG CCTCTGCCTC TAACTTCCTG 1380
TTTCTATGCT TCATGGTATT TCAGGTGTTT CGGAACATCA GTGGGAA3CA GTCCAGCCTG 1440
GCAG-rrATGA ggaaa:gg7G :-:ggcta:a7 ?a7gaggggg taatttttag gttcaagttc 1500
ctcagc-ggta TC-.rrrTCOC 77gggg7GG2 aggagggtga tcttcttcat cgttagtcag 1560
GTAAGG-GAAG GC7A.T7GG-GA A-.7GGGGG7G 7G7G-.2AG77 77AAGTGAAC AGTGCCTTTT 1620
TC-.GAG-GCA7 TTTAGGGGA I Z ."GGA-AGGGGG A7G7G7777GG 7GTGATGTTG TTGTATGCAC 1680
CA7CCCATAA AAAGG A7GGA XAAGA.G'GAG-7 GC-ATG-G.---.G GCAACTCCCA TGTAAATCGA 1740
GGG.A-jG.A77G 7GC7*77GT777 G7G7C3GA.AG 7777 A7G AA_ GA ATTGTTCAGC GGTTCGAAAT 1800
AT7G 777TGA 7 GAA7GAGAAG G7AGC77TCG G GGA777G-A77 GAACAAGGGA AGACATGTTT 1860
ATCAGG7T7G 7AG777GCA,37: 7 777AGAG7G AGA77GA7GG 7AGTTGTATA 02CAGAGAAA 1920
TACAG7GAT7 TAGCCTTGAG 77 7AAAA7G7 7AAA7A.7AA.G GAAAAAAGCG TGAAGAATAA 1980
ATATTCTTTG AGGTATTGTCT 7AGTOCTG77 , AAAAAAAA.A-. AAAAAAACTC GTGCGGAATT 2040
2057

(2) INFORMATION FOR SEQ ID NO: 227:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2084 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 227:


GGCAG.AGGG-2 GAG777GGTGC AAAGA.GGGAA AGG2GGAT7 2 G7GTGTGCC 2 CTCCTCTCGC 60
ACCAAGGC-C7 7G7A7AAAAA7 AGC7CTTG7T A.GGGGAAA.7A ACTGTTCATT TTTCACTCCT 120
CCGGGGTAC-G 7CACAGTTTT CA 3AAAAAGA ATG7GCA.7G 2 TGGAAACCAG AAGAAAAATA 180
TGAGACGGGG- AA-7GA.7GGTG 7GA7GTG7GT SC7GGG7777 j G7TGAGTGTG TGGAGTCCTG 240
CTCAGGrrGTr" A73^ACAG72G 7GTTTGA7GG 7G-G7GGG777G AGGGGAACCG CTTGTTCAGA 300
GC7G7GA.G7G GGGC7GCAGT C-GAGAGAAGG TGG7GT7GGG 7GGTGGTAG2 GCGGGGCCTT 360
CTCTGGTGG7 GA7GA. 7 G G_ AG AGCAGGGAGT GTCCGGGAGG CAGAAGGTAG CGGGGCAGCT 420
ACTGGAGGAG 7GTGCGGGCG 7GCG72GGG7T GCCCCCZCZQ GGGTGGGGG7 CTGTTGCTGG 480
TGGCCATCTA 77TC7AGTA7 77 CCTCCGAA A7G2GG7CGG CGCGCCCTTG AGTTGGATGC 540
TTGCCC7CC7 GGGCGTTGTG GGAGGCACGG AACATCCTCC TSGGCCTCAA GGGCCTGGGG 600
CC-.GCT^GAGA 7GG?GTGCAGT G7GTGAAAAA GGGAATGTCA ACGTGGCCGA TGGGCTGGCA 660
TGGTCATA77 AGA7GGGAGTA. 7GTGGGGGGG ATCGTGCCAG AGCTGCAGOC GGGGATTGGA 720
ACTTACAATG A3CAT7ACAA GAAGG7TGG7A CGGGGTGCAG TGAGCCAGCG GTGTNATATT 780
CTCCTCCCAT TGGACTGTGG GGT5CCTGAT AACCTGAGTA TGGCTGACCC CAACATTCGC 840
TTCCTGGATA AACTGCCCCA GCAGACCGGT GACCGTGCTG GCATCAAGGA TCGGGTTTAC 900
AGCAACAGCA TGTATGAGCT TCTGGAGAAC GGGCAGCGGG CGGGCACCTG TGTCCTGGAG 960
TACGCCACCC CCTTGCAGAC TTTGTTTGGG ATGTCACAAT ACAGTCAAGC TGGCTTTAGC 1020
GGGGAGGATA GGCTTGAGCA GGCCAAACTC TTCTGCCGGA CACTTGAGGA CATCCTGGCA 1080
GATGCCCCTG AGTCTCAGAA CAACTGCCGC CTCATTGCCT ACCAGGAACC TGCAGATGAC 1140
AGCAGCTTCT CGCTGTCCCA GGAGGTTCTC CGGCACCTGC GGCAGGAGGA AAA-3GAAGAG 1200
GTTAGTGTGG GCAGCTTGAA GACCTCAGCG GTGCCCAGTA CCTCCACGAT GTCC3AAGAG 1260
CCTGAGCTCC TCATCAGTGG AATGGAAAAG CCCCTCCCTC TCCGCACGGA TTTGTCTTGA 1320
GACCCAGGGT CACCAGGCCA GAGCCTCCAG TGGTCTCCAA GCCTCTGGAC TGGGGGCTCT 1380
CTTCAGTGGC TGAATGTCCA GCAGAGCTAT TTCCTTCCAC AGGGGGCCTT GCAGGGAAGG 1440
GTCCAGGACT TGACATCTTA AGATGCGTCT TGTCCCCTTG GGCCAGTCAT TTCCCCTCTC 1500
TGAGCCTCGG TGTCTTCAAC CTGTGAAATG GGATCATAAT CACTGCCTTA CCTCCCTCAC 1560
GGTTGTTGTG AGGACTGAGT GTGTGGAAGT TTTTCATAAA CTTTGGATGC TAGTGTACTT 1620
AGGGGGTGTG CCAGGTGTCT TTCATGGGGC CTTCCAGACC CACTCCCCAC CCTTCTCCCC 1680
TTCCTTTGCC CGGGGACGCC GAACTCTCTC AATGGTATCA ACAGGCTCCT TCGCCCTCTG 1740
GCTCCTGGTC ATGTTCCATT ATTGGGGAGC CCCAGCAGAA GAATGGAGAG GAGGAGGAGG 1800
CTGAGTTTGG GGTATTGAAT CCCCCGGCTC CCACCCTGCA GCATCAAGGT TGCTATGGAC 1860
TCTCCTGCCG GGCAACTCTT GCGTAATCAT GACTATCTCT AGGATTCTGG CACCACTTCC 1920
TTCCCTGGCC CCTTAAGCCT AGCTGTGTAT CGGCACCCCC ACCCCACTAG AGTACTCCCT 1980
CTCACTTGCG GTTTCCTTAT ACTCCACCCC TTTCTCAACG GTCCTTTTTT AAAGCACATC 2040
TCAGATTAAA AAAAAAAAAA AAAAAAAAAA AGGGGGGGCN GCNT 2084

(2) INFORMATION FOR SEQ ID NO: 228:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2143 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 228:
TCGACCCACG CGTCCGGTTG AATTCCTTGA CCTGCAAACA CATATTTATT AGCCTGACTC 60
AAACAATOAA GCTATTAAAA CTTCGGAGGA ACATTGTAAA ACTCTCTTTG TATCGGCATT 120
TCACCAACAC GCTTATTTTG GCAGTGGCAG CATCCATTGT GTTTATCATC TGGACAACCA 180
TGAAGTTCAG AATAGTGACA TGTCAGTCGG ACTGGCGGGA GCTGTGGGTA GACGATGCCA 240
TCTGGCGCTT GCTGTTCTCC ATGATCCTCT TTGTCATCAT GGTTCTCTGG CGA2CATCTG 300
CAAACAACCA GAGGTTTGCC TTTTCACCAT TGTCTGAGGA AGAGGAGGAG GAP3AACAAA 360
A3GAXCTAT GCTGAAAGAA AGCTTTGAAG GAATGAAAAT GAGAAGTACC AAACAAGAAC 420
CCAATGGAAA TAGTAAAGTT AACAAAGCAC AGGAAGATGA TTTGAAGTGG GTAGAAGAGA 480
ATGTTCCTTC TTCTGTGACA GATGTAGCAG TTCCAGCCCT TCTGGATTCA GATGAGGAAC 540
GAATGATCAC ACACTTTGAA AGGTCCAAAA TGGAGTAAGG AATGGGAAGA TTTGCAGTTA 600
AAGATGGCTA CCATCAGGGA AGAGATCAGC ATCTGTGTCA GTCTTCTGTA CGGCTCCATG 660
GGATTAAAGG AAGCAATGAC ATCCTGATCT GTTCCTTGAT CTTTGGGCAT TGGAGTTGGC 720
GAGAGGTGTC AGAACAAAGA GAACATCTTA CTGAAAACAA GTTCATAAGA TGA3AAAAAT 780
CTACGAGCTT CTTATTTACA ACACTGCTGC CCCCTTTCCT CCCAGACTCT GACATGGATG 840
TTCATGCAAC TTAAGTGTGT TGTTCCTGAA CTTTCTGTAA TGTTTCATTT TTTAAATCTG 900
ACAAACTAAA AAGTTTAACG TCTTCTAAAA GATTGTCATC AACACCATAA TATGTAATCT 960
CCAGGAGCAA CTGCCTGTAA TTTTTATTTA TTTAGGGAGT TACATAGGTG ATGGGGGAAA 1020
TTGTTAACTA CCTTTCATTT TCCTGGGAAG TCAAGGTTAC ATCTTGCAGA GGTTGTTTTG 1080
AGAAAAAAGG GCCCTTCTGA GTTAAGGAGC CATAGTTCTA TCAATGATCA AAAGAAAAAA 1140
AAAAAAAAGA GAAACTGTTA CAGTATGATT CAGATCATTT AAAAAAGCAA AATCAAGTGC 1200
AATTTTGTTT ACAAATGGTG TATATTAAAG ATTTTTCTAT TTCAGATGTA CTTTAAAGAG 1260
AAATATTAGC TTAACTCTTT TGACATCTGC TATTGTGACA CATCCCATTG CTGGCAATGT 1320
GGTGCACACT CCGAAACTTT TAACTACTGT TTTGTAAGCC TCCAAGGGTG GCATTGCAGG 1380
GTCCTTAGGC AATGTTTTGT TTGCCTTTAT GCAGAGAGGT GCTCCAAGTG CTGTGATTGA 1440
GCACCGTGCT AGAGGAACTG TAATGCTTCA GAAGTTGTAG CTTATACAAA GGAAACAGGT 1500
CCTGCTGGCT TAATTTAAAC AGTTATTGCA TGAAGTAGCG TGGAGGCCCT GGACTGCTGC 1560
TCGTTCTTTA GGATGGACTG TTCTGGTATC TGGTATTGGT TTAGAGACTG TTAATAAGGG 1620
ACATCACAAG GTGATGGGAT TCATTTGAAG CACTCTATTT CTGTTTTAAT GGTTTTATCC 1680
AA7TTTGCCT TCCCAAGATT TTTGTTCTAC ATAAAAAGTT CATGCCACTT TTTAATATAA 1740
AAAAATTTAA CAAAATTAAT GTATTTTTCT CATTTTTTTC AAACTTrrTC TAAAGACTCT 1800
TTCTGTCAAA CTCATGAAAA ATTTCTTTCT ATGGCTTTTA TTCTAGATTG TCTTATTTTC 1860
TGTTAAAACC AATGACCACA TGACCACAAT CTTCACTAAC TCATACTGCA GTGAAAGTGT 1920
TAACCCTTAG GTAGTTTCTC TACAACTCTT TGCTATGGTG ATTTTTAAAA AAGTTTCCTA 1980
GGGAAGTATC TCTGAGGGAA CAGGCAATCT GAAGGAACTG ACTATATTCT CCATGGCTAA 2040
GTCCATTAGG CCAAAAGNCT GGGTGGGTAT TGGTTGTCAN GCTGTCTATT GGCATATTAA 2100
AAACGTAGGC CGGANGGAAT AATTAGGTTG TNATGCCGGC GGG 2143



(2) INFORMATION FOR SEQ ID NO: 229: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1025 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 229: 

CCTGGCCCAC ATTGCTTCAT TGGCCTGGCC ATGCGCCTGT ACTATGGCAG CCGCTAGTCC 60
CTGACAACTT CCACCCTGAT TCCGGACCCT GTAGATTGGG CGCCACCACC AGATCCCCCT 120
CCCAGGCCTT CCTCCCTCTC CCATCAGCAG CCCTGTAACA AGTGCCTTGT GAGAAAAGCT 180
GGAGAAGTGA GGGCAGCCAG GTTATTCTCT GGAGGTTGGT GGATGAAGGG GTACCCTAGG 240
AGATGTGAAG TGTGGGTTTG GTTAAGGAAA TGCTTACCAT CCCCCACCCC CAACCAAGTT 300
CTTCCAGACT AAAGAATTAA GGTAACATCA ATACCTAGGC CTGAGAAATA ACCCCATCCT 360
TGTTGGGCAG CTCCCTGCTT TGTCCTGCAT GAACAGAGTT GATGAAAGTG GGGTGTGGGC 420
AACAAGTGGC TTTCCTTGCC TACTTTAGTC ACCCAGCAGA GCCACTGGAG CTGGCTAGTC 480
CAGCCCAGCC ATGGTGCATG ACTCTTCCAT AAGGGATCCT CACCCTTCCA CTTTCATGCA 540
AGAAGGCCCA GTTGCCACAG ATTATACAAC CATTACCCAA ACCACTCTGA CAGTCTCCTC 600
CAGTTCCAGC AATGCCTAGA GACATGCTCC CTGCCCTCTC CACAGTGCTG CTCCCCACAC 660
CTAGCCTTTG TTCTGGAAAC CCCAGAGAGG GCTGGGCTTG ACTCATCTCA GGGAATGTAG 720
CCCCTGGGCC CTGGCTTAAG CCGACACTCC TGACCTCTCT GTTCACCCTG AGGGCTGTCT 780
TGAAGCCCGC TACCCACTCT GAGGCTCCTA GGAGGTACCA TGCTTCCCAC TCTGGGGCCT 840
GCCCCTGCCT AGCAGTCTCC CAGCTCCCAA CAGCCTGGGG AAGCTCTGCA CAGAGTGACC 900
TGAGACCAGG TACAGGAAAC CTGTAGCTCA ATCAGTGTCT CTTTAACTGC ATAAGCAATA 960
AGATCTTAAT AAAGTCTTCT AGGCTGTAGG GTGGTTCCTA CAACCACAGC CAAAAAAAAA 1020
AAAAA 1025

(2) INFORMATION FOR SEQ ID NO: 230:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1250 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 230:
GCCCACGCGT CCGCCCACGC GTCCGGCGGT GCGGAGTATG GGGCGCTGAT GGCCATGGAG 60
GGCTACTGGC GCTTCCTGGC GCYGCTGGGG TCGGCACTGC TCGTCGGCTT CCTGTCGGTG 120
ATSTTCGCCC TCGTCTGGGT CCTCCACTAC CGAGAGGGGC TTGGCTGGGA TGGGAGCGCA 180
CTAGAGTTTA ACTGGCACCC AGTGCTSATG GTCACCGGCT TCGTCTTCAT CCAGGGCATC 240
GCATCATCGT CTACAGACTG CCGTGGACCT GGAAATGCAG CAAGCTCCTG ATGAAATCCA 300
TCCATGCAGG GTTAAATGCA GTTGCTGCCA TTCTTGCAAT TATCTCTGTG GTGGCCGTGT 360
TTGAGAACCA CAATGTTAAC AATATAGCCA ATATGTACAG TCTGCACAGC TGGGTTGGAC 420
TGATAGCTGT CATATGCTAT TTGTTACAGC TTCTTTCAGG TTTTTCAGTC TTTCTGCTTC 480
CATGGGCTCC GCTTTCTCTC CGAGCATTTC TCATGCCCAT ACATGTTTAT TCTGGAATTG 540
TCATCTTTGG AACAGTGATT GCAACAGCAC TTATGGGATT GACAGAGAAA CTGATTTTTT 600
CCCTGAGAGA TCCTGCATAC AGTACATTCC CGCCAGAAGG TGTTTTCGTA AATACGCTTG 660
GCCTTCTGAT CCTGGTGTTC GGGGCCCTCA TTTTTTGGAT AGTCACCAGA CCGCAATGGA 720
AACGTCCTAA GGAGCCAAAT TCTACCATTC TTCATCCAAA TGGAGGCACT GAACAGGGAG 780
CAAGAGGTTC CATGCCAGCC TACTCTGGCA ACAACATGGA CAAATCAGAT TCAGAGTTAA 840
ACARTGAAGT AGCAGCAAGG AAAAGAAACT TAGCTCTGGA TGAGGCTGGG CAGAGATCTA 900
CCATGTAAAA TGTTGTAGAG ATAGAGCCAT ATAACGTCAC GTTTCAAAAC TAGCTCTACA 960
GTTTTGCTTC TCCTATTAGC CATATGATAA TTGGGCTATG TAGTATCAAT ATTTACTTTA 1020
ATCACAAAGG ATGGTTTCTT GAAATAATTT GTATTGATTG AGGCCTATGA ACTGACCTGA 1080
ATTGGAAAGG ATGTGATTAA TATAAATAAT AGCAGATATA AATTGTGGTT ATGTTACCTT 1140
TATCTTGTTG AGGACCACAA CATTAGCACG GTGCCTTGTG CAKAATAGAT ACTCAATATG 1200
TGAATATGTG TCTACTACTA GTTAATTGGA TAAACTGGCA GCATCCCTGA 1250

(2) INFORMATION FOR SEQ ID NO: 231: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1811 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 231:

CNGNCAGTAC CGGTCNGATT CCCGGGTCGA CCCACGCGTC CGCTGCATTC CAGGGCCTTT 60
CAGTGGCTTT CATTCTGAAG TTCCTGGATA ACATGTTCCA TGTCTTGATG GCCCAGGTTA 120
CCASTGTCAT TATCACAACA GTCTCTGTCC TGGTCTTTGA CTTCAGGCCC TCCCTGGAAT 180
TTTTCTTGGA AGCCSCATCA GTCSTYCTCT CTATATTTAT TTATAATGCC AGCAAGCCTC 240
AAGTTCCGGA ATACGCACCT AGGCAAGAAA GGATCCGAGA TCTAAGTGGG AATCTTTGGG 300
AGCGTTCCAG TGGGGATGGA GAAGAACTAG AAAGACTTAC CAAACCCAA3 AGTGATGAGT 360
CAGATGAAGA TACTTTCTAA CTGGTACCCA CATAGTTTGC AGCTCTCTTG AACCTTATTT 420
TCACATTTTC AGTGTTTGTA ATATTTATCT TTTCACTTTG ATAAACCAGA AATGTTTCTA 480
AATCCTAATA TTCTTTGCAT ATATCTAGCT ACTCCCTAAA TGGTTCCATC CAAGGCTTAG 540
AGTACCCAAA GGCTAAGAAA TTCTAAAGAA CTGATACAGG AGTAACAATA TGAAGAATTC 600
ATTAATATCT CAGTACTTGA TAAATCAGAA AGTTATATGT GCAGATTATT TTCCTTGGCC 660
TTCAAGCTTC CAAAAAACTT GTAATAATCA TGTTAGCTAT AGCTTGTATA TACACATAGA 720
GATCAATTTG CCAAATATTC ACAATCATGT AGTTCTAGTT TACATGCCAA AGTCTTCCCT 780
TTTTAACATT ATAAAAGCTA GGTTGTCTCT TGAATTTTGA GGCCCTAGAG ATAGTCATTT 840
TGCAAGTAAA GAGCAACGGG ACCCTTTCTA AAAACGTTGG TTGAAGGACC TAAATACCTG 900
GCCATACCAT AGATTTGGGA TGATGTAGTC TGTGCTAAAT ATTTTGCTGA AGAAGCAGTT 960
TCTCAGACAC AACATCTCAG AATTTTAATT TTTAGAAATT CATGGGAAAT TOGATTTTTG 1020
TAATAATCTT TTGATGTTTT AAACATTGGT TCCCTAGTCA CCATAGTTAC CACTTGTATT 1080
TTAAGTCATT TAAACAAGCC ACGGTGGGGC TTTTTTCTCC TCAGTTTGAG GAGAAAAATC 1140
TTGATGTCAT TACTCCTGAA TTATTACATT TTGGAGAATA AGAGGGCATT TTATTTTATT 1200
AGTTACTAAT TCAAGCTGTG ACTATTGTAT ATCTTTCCAA GAGTTGAAAT GCTGGCTTCA 1260
GAATCATACC AGATTGTCAG TGAAGCTGAT GCCTAGGAAC TTTTAAAGGG ATCCTTTCAA 1320
AAGGATCACT TAGCAAACAC ATGTTGACTT TTAACTGATO TATGAATATT AATACTCTAA 1380
AAATAGAAAG ACCAGTAATA TATAAGTCAC TTTACAGTGC TACTTCACAC TTAAAAGTGC 1440
ATGGTATTTT TCATGGTATT TTGCATGCAG CCAGTTAACT CTCGTAGATA GAGAAGTCAG 1500
GTGATAGATG ATATTAAAAA TTAGCAAACA AAAGTGACTT GCTCAGGGTC ATGCAGCTGG 1560
GTGATGATAG AAGAGTGGGC TTTAACTGGC AGGCCTGTAT GTTTACAGAC TACCATACTG 1620
TAAA7ATGAG CTTTATGGTG TCATTCTCAG AAACTTATAC ATTTCTCJCTC TCCTTTCTCC 1680
TAAGTTTCAT GCAGATGAAT ATAAG3TAAT ATACTATTAT ATAATTCATT TGTGATATCC 1740
ACAATAATAT GACTGGCAAG AATTGGTGGA AATTTGTAAT TAAAATAATT ATTAAACCTA 1800
AAAAAAAAAN N 1811

(2) INFORMATION FOR SEQ ID NO: 232:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2271 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 232:
CTGACCTCAT GGCGTAGAGC CTAGCAACAG CGCAGGCTCC CAGCCGAGTC CGTTATGGCC 60
GCTGCCGTCC CGAAGAGGAT GAGGGGGCCA GCACAAGCGA AACTGCTGCC CGGGTCGGCC 120
ATCCAAGCCC TTGTGGGGTT GGCGCGGCCG CTGGTCTTGG CGCTCCTGCT TGTGTCCGCC 180
GCTCTATCCA GTGTTGTATC ACGGACTGAT TCACCGAGCC CAACCGTACT CAACTCACAT 240
ATTTCTACCC CAAATGTGAA TGCTTTAACA CATGAAAACC AAACCAAACC TTCTATTTCC 300
CAAATCAGCA CCACCCTCCC TCCCACGACG AGTACCAAGA AAAGTGGAGG AGCATCTGTG 360
GTCCCTCATC CCTCGCCTAC TCCTCTGTCT CAAGAGGAAG CTGATAACAA TGAAGATCCT 420
AGTATAGAGG AGGAGGATCT TCTOATGCTG AACAGTTCTC CATCCACAGC CAAAGACACT 480
CTAGACAATG GCGATTATGG AGAACCAGAC TATGACTGGA CCACGGGCCC CAGGGACGAC 540
GACGAGTCTG ATNGACACCT TGGAAGAAAA CAGGGGTTAC ATGGAAATTG AACAGTCAGT 600
GAAATCTTTT AAGATGCCAT CCTCAAATAT AGAAGAGGAA GACAGCCATT TCTTTTTTCA 660
TCTTATTATT TTTGCTTTTT GCATTGCTGT TGTTTACATT ACATATCACA ACAAAAGGAA 720
GATTTTTCTT CTGGTTCAAA GCAGGAAATG GCGTGATGGC CTTTGTTCCA AAACAGTGGA 780
ATACCATCGC CTAGATCAGA ATGTTAATGA GGCAATGCCT TCTTTGAAGA TTACCAATGA 840
TTATATTTTT TAAAGCACTG TGATTTGAAT TTGCTTATGT AATTTTATTT GCTTGACTTT 900
TTATATGATA TTGTGCAAAT GTTTGCCATA GGCAATTGGT ACTTAAATGA GAGGTGAGTC 960
TCTCTTTTGC CTTGGTGCTT TGGAAATTAA ATGTCACAAA CGAGTATATA ATTTTTTATC 1020
TGTACTTTTA GAGCTGAGTT TAATCAGGTG TCCAAAATGT GAGTTAAACA TTACCTTATA 1080
TTTACACTGT TAGTTTTTAT TGTTTTAGAT TTATTATGCT TCTTCTGGAA GTATTAGTGA 1140
TGCTACTTIT AAAAGA7CCC AAACTTGTAA CTAAATTCTG ACATATCTGT 7ACTGCTGAC 1200
TCACATTCAT TCTCCGCCAT TCAAATACTA TTTTTTATCC ACATTTTTTT TTGTTCCCAA 1260
ACTGTAATGT ACAAGGATAT GTGTGATAAT GCTTTGGATT TGAGTAATAT TTTTTTTTCT 1320
TCCAAGAAAA CTGCTTTGGA TATTTTTAGA TAATTTAAAC ATAATTTAGG ATAATGATAT 1380
TGC~CAATCT GACCACAATT TTAGGTAAAA CATTAAATGT GTCAAGAAAT CTTGGCAACA 1440
GAGACTCTGC AGCTTGCAGT GGACATAGAT AAAATGTTAC AGAGATACTA TTTTTTTGGT 1500
T3GAATTACT ATATTAAATT TAGAAGCAGA AACTGGTAAA ATGTTAAATA CATGTACAAT 1560
TGCTTTTAGT TAGCAATTGA TTGTAGCATG GGTTCCTCCA AGGTTTCAAG CAATGGGCAG 1620
AGTTTAAAAT TATATCAGAT TCGTTTACTT CGTTTATTAT TTTACAGTAA ATTTGAATAA 1680
ATCTTAGGGG TCATTATCAC TTAAATAATA CTGTACCTAG GTCTTTCAAA TTAAAATTAT 1740
ACCTGAATGA AGTTGTTTGT ATACATAAAG GATATTTGTG TACAATTACC TTTTTTCCCC 1800
CACACTTGTT TTCTTTGTTT TTGTTTTTTA TGGCAACTGG AAAGTATTTA CTATGGGATT 1860
CATTTATGTC TGTCTTTCTA TCATAAAGAA TTGATCAATA TGTAAATATG TGATTTGAAC 1920
CATGGTTGAC TTACAAGTGT CACTACAGCT TTTTAGAAAA CATAGCCCTA ATATATGTTA 1980
AGCAGGACCC GGGTGAGCCA GTGGGCTTGC GCTTTATGTA GAGCTGGAAG AAGGCCGTCC 2040
ATCCTGTCTC TTGGGCGGAC AGTGTACTTT CCTAATAGGG AAGGGAAGCA CAATGGAAAT 2100
ACCCCTGAAC CGTTTTATTG CAGTAATTTT TTTCATATCT GAAACTATTA TTTAATATTT 2160
TGAATAAGAT TTTAAAAAAT AAATGGCAAA GATATAAATC TAAAAAAAAA AAAAAAAAAA 2220
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAANANA N 2271



(2) INFORMATION FOR SEQ ID NO: 233: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1338 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 233:

CTTCCGGTTC TCCGGGCAGC TGCCACTGCT GTAGCTTCTG CCACCTGCCA CGACCGGGCC 60 

TCTCCCTGGC GTTTGGTCAC CTCTGCTTCA TTCTCCACCG CGCCTATGGT CCCTCTTGGA 120 

GCCAGCGTGG CGNGCCTGGC GGCTCCCGGG TGGTGAGAGA GCGGTCCGGG AACGATGAAG 180 

GCCTCGCAGT GCTC<TGCTG TCTCAGCCAC CTCTTGGCTT CCGTCCTCCT CCTGCTGTTG 240 

CTGCCTGAAC TAAGCGGGYC CCTGGMAGTC CTGCTGCAGG CAGCCGAGGC CGCGCCAGGT 300 
YTTGGGCCTC CTGACCCTAG ACCAGGACAT TACCGCCGCT GCCACCGGGC CCTOACCCCT 360 
GCCCAGCAGC CG^GCCGTGG TCTGGCTGAA GCTGCGGGGG CCGCGGGGCT CCGAGGGAGG 420 
CAATGGCAGC AAXCTGTGG CCGGGCTTGA GACGGACGAT CAC3GAGGGA A3GCCGGGGA 480 
ARGCTCGGTG GGTGGCGGCC TTGCTGTGAG CCCCAACCCT GGCGACAAGC CCATGACCCA 540 
GCGGGCCCTG ACCGTGTTGA TGGTGGTGAG CGGCGCGGTG CTGGTGTACT TCGTGGTCAG 600 
GACGGTCAGG ATGAGAAGAA GAAACCGAAA GACTAGGAGA TATGGAGTTT TGGACACTAA 660 
CATAGAAAAT ATGGAATTGA CACCTTTAGA ACAGGATGAT GAGGATGATG ACAACACGTT 720 
GTTTGATGCC AATCATCCTC GAAGATAAGA ATGTGCCTTT TGATGAAAGA AOTTTATCTT 780 
TCTACAATGA AGAGTGGAAT TTCTATGTTT AAGGAATAAG AAGCCACTAT ATCAATGTTG 840 
GGGGGGTATT TAAGTTACAT ATATTTNAAC AACCTTTAAT TTGCTGTTGC AATAAATACC 900 
GTATCCTTTT ATTATATCTT TATATGTATA GAAGTACTCT GTTAATGGGC TCAGAGATGT 960 
TGGGGATAAA GTATACTGTA ATAATTTATC TGTTTGAAAA TTACTATAAA AGGGTGTTTT 1020 
CTGRTGGGTT TTTGTTTCCT GCTTACCATA TGATTGTAAA TTGTTTTATG TATTAATCAG 1080 
TTAATGCTAA TTATTTTTGC TGATGTCATA TGTTAAAGAG CTATAAATTC CAACAACCAA 1140 
CTGGTGTGTA AAAATAATTT AAAATYTCCT TTACTGAAAG GTATTTCCCA TTTTTGTGGG 1200 
GAAAAGAAGC CAAATTTATT ACTTTGTGTT GGGGTTTTTA AAATATTAAG AAATGTCTAA 1260 
GTTATTGTTT GCAAAACAAT AAATATGATT TTAAATTCTC TTAAAAAAAA AAAAAAAAAC 1320 
CCCGGGGGGG GGCCCGGN 1338 

(2) INFORMATION FOR SEQ ID NO: 234: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 31 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 234: 

Met Leu Ser Thr Gly Ile Glu Val Ala Arg Pro Pro Ala Thr Leu Leu 
1 5 10 15 

Gly Leu Met Phe Val Leu Thr Gly Met Pro Arg Gly Leu Arg Xaa 
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 235: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 116 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 235: 

Met Asn Val Val Ile Val Ile Ile Leu Phe Ser Phe Asp Ser Val Gly 
1 5 10 15 

Thr Met Phe Ser Cys Asn Arg Ile Pro Lys Ile Thr Val Leu Asn Lys 
20 25 30 

Leu Lys Phe Xaa Cys Glu Val Leu Leu Arg Ile Gln Thr Ile Gln Gly 
35 40 45 

Phe Tyr Arg Cys Thr Arg Ile Ser Arg Tyr Lys Gly Ile Phe Pro Asp 
50 55 60 

Phe Cys Gln Ser Gln Cys Met Gly Cys Asn Pro Glu Ser Xaa Met Ala 
65 70 75 80 

Val Pro Ala Leu Val Thr Pro Ile Leu Ala His Arg Lys Lys Glu Lys 
85 90 95 

Gly Met Cys Leu Phe Thr Leu Ile Ile Ala Pro Thr Arg Cys Thr His 
100 105 110 

Tyr Phe Cys Xaa 
115 



(2) INFORMATION FOR SEQ ID NO: 236: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 103 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 236: 

Met Ser Ser Ala Lys Ile Val Arg Gln Arg Gly Ala Val Pro Thr Tyr 
1 5 10 15 

Tyr Thr Thr Glu Ala Gly Glu Ile Ile Phe Leu Val Leu Asn Trp Ser 
20 25 30 

Leu Ser Ile Leu His Ile Val Asp Val Leu Cys Ser Lys Pro Glu Lys 
35 40 45 

Ser Val Thr Glu Asp Ala Ala Ser Gly Leu Ser Gln Arg Met Thr Ala 
50 55 60 

Leu Val Trp Arg Lys Gly Pro Asp Gly Gly Ser Arg Lys Pro Ile Leu 
65 70 75 80 

Leu Leu Phe Phe Phe Leu Pro Leu Ile Leu Cys Phe His Ser Phe Ile 
85 90 95 

His Ser Ser Asn Ile Cys Xaa 
100 

(2) INFORMATION FOR SEQ ID NO: 237: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 237: 

Met Ile Leu Phe Pro Gln Xaa Ala Leu Arg Leu Gly Xaa Trp Pro Arg 
1 5 10 15 

Thr Trp Ser Ile Leu Xaa Lys Tyr Ser Val Asn Phe Phe Ser Ala Tyr 
20 25 30 

Ser Pro Met Gly Ala Val Gly Thr Glu Phe 
35 40 



(2) INFORMATION FOR SEQ ID NO: 238: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 238: 

Met Ile Ile Leu Leu Leu Phe Met Leu Leu Asn Asn Val Val Leu Val 
1 5 10 15 

Gln Glu Asp Asn Cys Gln Arg Lys Asn Thr Val Gln Glu Arg Arg Xaa 
20 25 30 

Trp Ser Gln Trp Xaa 
35 



(2) INFORMATION FOR SEQ ID NO: 239: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 128 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 239: 

Met Ala Ala Xaa Pro Pro Gly Cys Thr Pro Pro Xaa Leu Leu Asp Ile 
1 5 10 15 

Ser Trp Leu Thr Glu Ser Leu Gly Ala Gly Gln Pro Val Pro Val Glu 
20 25 30 

Cys Arg His Arg Leu Glu Val Ala Gly Pro Arg Lys Gly Pro Leu Ser 
35 40 45 

Pro Ala Trp Met Pro Ala Tyr Ala Cys Gln Arg Pro Thr Pro Leu Thr 
50 55 60 

His His Asn Thr Gly Leu Ser Glu Leu Leu Glu His Gly Val Cys Glu 
65 70 75 80 

Glu Val Glu Arg Val Arg Arg Ser Glu Arg Tyr Gln Thr Met Lys Val 
85 90 95 

Arg Arg Ala Gly Leu Gly Pro Thr Pro Gly Met Ser Cys Pro Gly Asn 
100 105 110 

Asp Asn Thr Val His Thr Met His Gly Glu Ala Asn Arg Gly Ser Xaa 
115 120 125 



(2) INFORMATION FOR SEQ ID NO: 240: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 67 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 240: 

Met Ser Ile Leu Cys Cys Pro Xaa Leu Cys Leu Phe Phe Ser Phe Cys 
1 5 10 15 

Ile Ser Ser Gly Ser Cys Pro Phe Ser His Val Ser Gln Leu Ser Phe 
20 25 30 

Ile Ala Thr Phe Ser Gln Ser Ser Pro Val Leu Leu Val Pro Ala Tyr 
35 40 45 

Asn Thr Tyr Leu Ser Phe Leu Ala Phe Leu Asp Cys Ala Ser Leu Thr 
50 55 60 

Ser Thr Xaa 
65 



(2) INFORMATION FOR SEQ ID NO: 241: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 69 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 241: 

Met Ser Thr Phe Gln Leu Leu Leu Leu Ile Leu Ala Gln Ser Thr Tyr 
1 5 10 15 

Lys Ile Lys Ser Lys Pro Leu His Met Thr Asn His Thr Leu Leu Asn 
20 25 30 

Ser Pro Gly Leu Asn Pro Ser Ser Pro Thr Leu Asn Phe Lys Thr Gln 
35 40 45 

Gln His Glu Ser Val Ser Tyr Ala Cys Cys His Met Arg Ser Leu His 
50 55 60 

His Ala Phe Ala Xaa 
65 



(2) INFORMATION FOR SEQ ID NO: 242: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 242: 

Met Val Ser Val Val Leu Ile Phe Ser Phe Leu Ser Leu Thr Ile Ser 
1 5 10 15 

Thr Thr Ala Ser Ala Tyr Asn Gly Asn Asp Thr Gln Gly Trp Asn Asp 
20 25 30 

Lys Phe His Xaa Xaa Ser Val Lys Thr Gln Thr Xaa 
35 40 



(2) INFORMATION FOR SEQ ID NO : 243: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 51 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 243: 

Met Ile Ser Asp Ala Gly Ala Gly Phe Gly Val Phe Leu Leu Val Pro 
1 5 10 15 

Arg Ala Gly His Cys Trp Gly Ala Gly Lys Pro Leu Pro Ser Cys Pro 

20 25 30 

Ser Val Ala Ser Ile Pro Ser Trp Val Leu Pro Ser Phe Leu Glu Arg 
35 40 45 

Gly Arg Xaa 

50 



(2) INFORMATION FOR SEQ ID NO: 244: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH : 43 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 244: 

Met Val Gln Thr Ile Gln Aso Phe Leu Ser Leu Phe Ser Thr Pro Ile 
1 5 10 15 



Phe Leu Leu Leu Leu Met Phe Glu Thr Leu Ser Leu Ala Pro Ala Trp 

20 25 30 

(2) INFORMATION FOR SEQ ID NO: 245: 

(i) SEQUENCE CHARACTERISTICS: 

(A) 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



Met Ile Leu Met Pro H. 



Ser Arg S_n Arg Ser Val Pro 



Phe Val Pro Thr Leu Asn Ala Ser 71— Pro Gly Ala Met Thr Gly Pro 
20 25 30 





Thr Ala Thr Leu Thr Ser Tys 
35 



Thr Thr Ala Cys Arg Val Ser 
45 



Trp Ala Asn Gly Trp Thr Ser Leu Arg Thr Phe Arg Xaa 
50 55 60 



(2) INFORMATION FOR SEQ ID NO: 246: 

(i) SEQUENCE CHARACTERISTICS: 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 246: 

Met Ser His His Ala Gln Pro Arg Phe Leu Leu Ile Thr Met Leu Leu 
1 5 10 15 

Gln Glu Ala Lys Pro 7-1 Ser Asn Ile Pro His Leu Leu Glu Ser Trp 
20 25 30 

Tyr Phe Gly Xaa 
35 



(2) INFORMATION FOR SEQ ID NO: 247: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2: amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 247: 

Met Asn Ser Leu Phe Ile Met Ile Leu Leu Pro Val Ser Gln Asp Gln 
1 5 10 15 

Val Val Glu Gly Leu Gln Gly 
20 

Ile His Met Arg Ile 
30 

Leu Arg Lys His Leu Xaa 

35 



(2) INFORMATION FOR SEQ ID NO: 248: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 211 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 248: 

Met Ser Arg Ser Xaa Asp Val Thr Asn Thr Thr Phe Leu Leu Met Ala 
1 5 10 15 

Ala Ser Ile Tyr Leu His Asp Gln Asn Pro Asp Ala Ala Leu Arg Ala 
20 25 30 

Leu His Gln Gly Asp Ser Leu Glu Cys Thr Ala Met Thr Val Gln Ile 
35 40 45 

Leu Leu Lys Leu Asp Arg Leu Asp Leu Ala Arg Lys Glu Leu Lys Arg 
50 55 60 

Met Gln Asp Leu Asp Glu Asp Ala Thr Leu Thr Gln Leu Ala Thr Ala 
65 70 75 80 

Trp Val Ser Leu Ala Thr Gly Gly Glu Lys Leu Gln Asp Ala Tyr Tyr 
85 90 95 

Ile Phe Gln Glu Met Ala Asp Lys Cys Ser Pro Thr Leu Leu Leu Leu 
100 105 110 

Asn Gly Gln Ala Ala Cys His Met Ala Gln Gly Arg Trp Glu Ala Ala 
115 120 125 

Glu Gly Leu Leu Gln Glu Ala Leu Asp Lys Asp Ser Gly Tyr Pro Glu 
130 135 140 

Thr Leu Val Asn Leu Ile Val Leu Ser Gln His Leu Gly Lys Pro Pro 
145 150 155 160 

Glu Val Thr Asn Arg Tyr Leu Ser Gln Leu Lys Asp Ala His Arg Ser 
165 170 175 

His Pro Phe Ile Lys Glu Tyr Gln Ala Lys Glu Asn Asp Phe Asp Arg 
180 185 190 

Leu Val Leu Gln Tyr Ala Pro Ser Ala Glu Ala Gly Pro Glu Leu Ser 
195 200 205 

Gly Pro Xaa 
210 



(2) INFORMATION FOR SEQ ID NO: 249: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 548 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 249: 

Met Glu Asp Ser Glu Ala Leu Gly Phe Glu His Met Gly Leu Asp Pro 
1 5 10 15 

Arg Leu Leu Gln Ala Val Thr Asp Leu Gly Trp Ser Arg Pro Thr Leu 
20 25 30 

Ile Gln Glu Lys Ala Ile Pro Leu Ala Leu Glu Gly Lys Asp Leu Leu 
35 40 45 

Ala Arg Ala Arg Thr Gly Ser Gly Lys Thr Ala Ala Tyr Ala Ile Pro 
50 55 60 

Met Leu Gln Leu Leu Leu His Arg Lys Ala Thr Gly Pro Val Val Glu 
65 70 75 80 

Gln Ala Val Arg Gly Leu Val Leu Val Pro Thr Lys Glu Leu Ala Arg 
85 90 95 

Gln Ala Gln Ser Met Ile Gln Gln Leu Ala Thr Tyr Cys Ala Arg Asp 
100 105 110 

Val Arg Val Ala Asn Val Ser Ala Ala Glu Asp Ser Val Ser Gln Arg 
115 120 125 

Ala Val Leu Met Glu Lys Pro Asp Val Val Val Gly Thr Pro Ser Arg 
130 135 140 

Ile Leu Ser His Leu Gln Gln Asp Ser Leu Lys Leu Arg Asp Ser Leu 
145 150 155 160 

Glu Leu Leu Val Val Asp Glu Ala Asp Leu Leu Phe Ser Phe Gly Phe 
165 170 175 

Glu Glu Glu Leu Lys Ser Leu Leu Cys His Leu Pro Arg Ile Tyr Gln 
180 185 190 

Ala Phe Leu Met Ser Ala Thr Phe Asn Glu Asp Val Gln Ala Leu Lys 
195 200 205 

Glu Leu Ile Leu His Asn Pro Val Thr Leu Lys Leu Gln Glu Ser Gln 
210 215 220 

Leu Pro Gly Pro Asp Gln Leu Gln Gln Phe Gln Val Val Cys Glu Thr 
225 230 235 240 

Glu Glu Asp Lys Phe Leu Leu Leu Tyr Ala Leu Leu Lys Leu Ser Leu 
245 250 255 

Ile Arg Gly Lys Ser Leu Leu Phe Val Asn Thr Leu Glu Arg Ser Tyr 
260 265 270 

Arg Leu Arg Leu Phe Leu Glu Gln Phe Ser Ile Pro Thr Cys Val Leu 
275 280 285 

Asn Gly Glu Leu Pro Leu Arg Ser Arg Cys His Ile Ile Ser Gln Phe 
290 295 300 

Asn Gln Gly Phe Tyr Asp Cys Val Ile Ala Thr Asp Ala Glu Val Leu 
305 310 315 320 

Gly Ala Pro Val Lys Gly Lys Arg Arg Gly Arg Gly Pro Lys Gly Asp 
325 330 335 

Lys Ala Ser Asp Pro Glu Ala Gly Val Ala Arg Gly Ile Asp Phe His 
340 345 350 

His Val Ser Ala Val Leu Asn Phe Asp Leu Pro Pro Thr Pro Glu Ala 
355 360 365 

Tyr Ile His Arg Ala Gly Arg Thr Ala Arg Ala Asn Asn Pro Gly Ile 
370 375 380 

Val Leu Thr Phe Val Leu Pro Thr Glu Gln Phe His Leu Gly Lys Ile 
385 390 395 400 

Glu Glu Leu Leu Ser Gly Glu Asn Arg Gly Pro Ile Leu Leu Pro Tyr 
405 410 415 

Gln Phe Arg Met Glu Glu Ile Glu Gly Phe Arg Tyr Arg Cys Arg Asp 
420 425 430 

Ala Met Arg Ser Val Thr Lys Gln Ala Ile Arg Glu Ala Arg Leu Lys 
435 440 445 

Glu Ile Lys Glu Glu Leu Leu His Ser Glu Lys Leu Lys Thr Tyr Phe 
450 455 460 

Glu Asp Asn Pro Arg Asp Leu Gln Leu Leu Arg His Asp Leu Pro Leu 
465 470 475 480 

His Pro Ala Val Val Lys Pro His Leu Gly His Val Pro Asp Tyr Leu 
485 490 495 

Val Pro Pro Ala Leu Arg Gly Leu Val Arg Pro His Lys Lys Arg Lys 
500 505 510 

Lys Leu Ser Ser Ser Cys Arg Lys Ala Lys Arg Ala Lys Ser Gln Asn 
515 520 525 

Pro Leu Arg Ser Phe Lys His Lys Gly Lys Lys Phe Arg Pro Thr Ala 
530 535 540 

Lys Pro Ser Xaa 
545 



(2) INFORMATION FOR SEQ ID NO: 250: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 299 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 250: 

Met Thr Thr Val Pro Pro Ser Pro Arg Pro Met Ser Arg Pro Ser Glu 
1 5 10 15 

Arg Asn Met Arg Arg Pro Arg Gly Pro Ser Pro Leu Pro Ala Ser Pro 

20 25 30 

Arg Asn Ser Thr Pro Asp Glu Pro Asp Val His Phe Ser Lys Lys Phe 
35 40 45 

Leu Asn Val Phe Met Ser Gly Arg Ser Arg Ser Ser Ser Ala Glu Ser 
50 55 60 

Phe Gly Leu Phe Ser Cys Ile Ile Asn Gly Glu Glu Gln Glu Gln Thr 
65 70 75 80 

His Arg Ala Ile Phe Arg Phe Val Pro Arg His Glu Asp Glu Leu Glu 
85 90 95 

Leu Glu Val Asp Asp Pro Leu Leu Val Glu Leu Gln Ala Glu Asp Tyr 
100 105 110 

Trp Tyr Glu Ala Tyr Asn Met Arg Thr Gly Ala Arg Gly Val Phe Pro 
115 120 125 

Ala Tyr Tyr Ala Ile Glu Val Thr Lys Glu Pro Glu His Met Ala Ala 
130 135 140 

Leu Ala Lys Asn Ser Asp Trp Val Asp Gln Phe Arg Val Lys Phe Leu 
145 150 155 160 

Gly Ser Val Gln Val Pro Tyr His Lys Gly Asn Asp Val Leu Cys Ala 
165 170 175 

Ala Met Gln Lys Ile Ala Thr Thr Arg Arg Leu Thr Val His Phe Asn 
180 185 190 

Pro Pro Ser Ser Cys Val Leu Glu Ile Ser Val Arg Gly Val Lys Ile 
195 200 205 

Gly Val Lys Ala Asp Asp Ser Gln Glu Ala Lys Gly Asn Lys Cys Ser 
210 215 220 

His Phe Phe Gln Leu Lys Asn Ile Ser Phe Cys Gly Tyr His Pro Lys 
225 230 235 240 

Asn Asn Lys Tyr Phe Gly Phe Ile Thr Lys His Pro Ala Asp His Arg 
245 250 255 

Phe Ala Cys His Val Phe Val Ser Glu Asp Ser Thr Lys Ala Leu Ala 
260 265 270 

Glu Ser Val Gly Arg Ala Phe Gln Gln Phe Tyr Lys Gln Phe Val Glu 
275 280 285 

Tyr Thr Cys Pro Thr Glu Asp Ile Tyr Leu Glu 
290 295 

(2) INFORMATION FOR SEQ ID NO: 251: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 251: 

Leu Leu Tyr Leu Leu Lys Val Xaa Val Ile Phe Val Phe Ser Ser Ser 
1 5 10 15 

Lys Gly Val Thr Leu Val Ser Met Asn Leu Thr Ser Phe Phe Val Ser 
20 25 30 

Ser Val Leu Ala Cys Phe Ser Xaa 
35 40 



(2) INFORMATION FOR SEQ ID NO: 252: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 594 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 252: 

Met Pro Ala Ser Ser Leu Glu Ser Arg Ser Phe Leu Leu Ala Lys Lys 
1 5 10 15 

Ser Gly Glu Asn Val Ala Lys Phe Ile Ile Asn Ser Tyr Pro Lys Tyr 
20 25 30 

Phe Gln Lys Asp Ile Ala Glu Pro His Ile Pro Cys Leu Met Pro Glu 
35 40 45 

Tyr Phe Glu Pro Gln Ile Lys Asp Ile Ser Glu Ala Ala Leu Lys Glu 
50 55 60 

Arg Ile Glu Leu Arg Lys Val Lys Ala Ser Val Asp Met Phe Asp Gln 
65 70 75 80 

Leu Leu Gln Ala Gly Thr Thr Val Ser Leu Glu Thr Thr Asn Ser Leu 
85 90 95 

Leu Asp Xaa Leu Cys Tyr Tyr Gly Asp Gln Glu Pro Ser Thr Asp Tyr 
100 105 110 

His Phe Gln Gln Thr Gly Gln Ser Glu Ala Leu Glu Glu Glu Asn Asp 
115 120 125 

Glu Thr Ser Arg Arg Lys Ala Gly His Gln Phe Gly Val Thr Trp Arg 
130 135 140 

Ala Lys Asn Asn Ala Glu Arg Ile Phe Ser Leu Met Pro Glu Lys Asn 
145 150 155 160 

Glu His Ser Tyr Cys Thr Met Ile Arg Gly Met Val Lys His Arg Ala 
165 170 175 

Tyr Glu Gln Ala Leu Asn Leu Tyr Thr Glu Leu Leu Asn Asn Arg Leu 
180 185 190 

His Ala Asp Val Tyr Thr Phe Asa Ala Leu Ile Glu Ala Thr Val Cys 
195 200 205 

Ala Ile Asn Glu Lys Phe Glu Glu Lys Trp Ser Lys Ile Leu Glu Leu 
210 215 220 

Leu Arg His Met Val Ala Gln Lys Val Lys Pro Asn Leu Gln Thr Phe 
225 230 235 240 

Asn Thr Ile Leu Lys Cys Leu Arg Arg Phe His Val Phe Ala Arg Ser 
245 250 255 

Pro Ala Leu Gln Val Leu Arg Glu Met Lys Ala Ile Gly Ile Glu Pro 
260 265 270 

Ser Leu Ala Thr Tyr His His Ile Ile Arg Leu Phe Asp Gln Pro Gly 
275 280 285 

Asp Pro Leu Lys Arg Ser Ser Phe Ile Ile Tyr Asp Ile Met Asn Glu 
290 295 300 

Leu Met Gly Lys Arg Phe Ser Pro Lys Asp Pro Asp Asp Asp Lys Phe 
305 310 315 320 

Phe Gln Ser Ala Met Ser Ile Cys Ser Ser Leu Arg Asp Leu Glu Leu 
325 330 335 

Ala Tyr Gln Val His Gly Leu Leu Lys Thr Gly Asp Asn Trp Lys Phe 
340 345 350 

Ile Gly Pro Asp Gln His Arg Asn Phe Tyr Tyr Ser Lys Phe Phe Asp 
355 360 365 

Leu Ile Cys Leu Met Glu Gln Ile Asp Val Thr Leu Lys Trp Tyr Glu 
370 375 380 

Asp Leu Ile Pro Ser Ala Tyr Phe Pro His Ser Gln Thr Met Ile His 
385 390 395 400 

Leu Leu Gln Ala Leu Asp Val Ala Asn Arg Leu Glu Val Ile Pro Lys 
405 410 415 

Ile Trp Lys Asp Ser Lys Glu Tyr Gly His Thr Phe Arg Ser Asp Leu 
420 425 430 

Arg Glu Glu Ile Leu Met Leu Met Ala Arg Asp Lys His Pro Pro Glu 
435 440 445 

Leu Gln Val Ala Phe Ala Asp Cys Ala Ala Asp Ile Lys Ser Ala Tyr 
450 455 460 

Glu Ser Gln Pro Ile Arg Gln Thr Ala Gln Asp Trp Pro Ala Thr Ser 
465 470 475 480 

Leu Asn Cys Ile Ala Ile Leu Phe Leu Arg Ala Gly Arg Thr Gln Glu 
485 490 495 

Ala Trp Lys Met Leu Gly Leu Phe Arg Lys His Asn Lys Ile Pro Arg 
500 505 510 

Ser Glu Leu Leu Asn Glu Leu Met Asp Ser Ala Lys Val Ser Asn Ser 
515 520 525 

Pro Ser Gln Ala Ile Glu Val Val Glu Leu Ala Ser Ala Phe Ser Leu 
530 535 540 

Pro Ile Cys Glu Gly Leu Thr Gln Arg Val Met Ser Asp Phe Ala Ile 
545 550 555 560 

Asn Gln Glu Gln Lys Glu Ala Leu Ser Asn Leu Thr Ala Leu Thr Ser 
565 570 575 

Asp Ser Asp Thr Asp Ser Ser Ser Asp Ser Asp Ser Asp Thr Ser Glu 
580 585 590 

Gly Lys 



(2) INFORMATION FOR SEQ ID NO: 253: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 253: 

Met Lys Leu Asn Leu Cys Ile Pro Asn Trp Ala Arg Cys Pro Leu Leu 
1 5 10 15 

Leu Leu Phe Pro Gln Leu Leu Pro Phe Gln Gly Glu Asp Asp Asp Pro 
20 25 30 

Leu Lys Ala Lys Ala Ala Asn Leu Val Glu Ala Val Pro Trp Gly Ile 
35 40 45 

Lys Ala Pro Ser Phe Gln Val Thr Cys Leu Val Arg Val Gln Leu Gln 
50 55 60 

Ser Cys Thr Pro Ser Arg Pro Ser Thr Leu Leu Ala Thr Ser Gln Ser 
65 70 75 80 

Pro Gly Arg Ile Ser Cys Tyr Ser Pro Leu Ser His Leu Pro Pro Val 
85 90 95 

Thr Thr Ser Ile Gln Pro Ser Pro Val Met Val Pro Phe Gln Tyr Gln 
100 105 110 

Ala Phe Leu Leu Gln Val Lys Glu Pro Ala Ala Gln Thr Leu Leu Gly 
115 120 125 

Gln Gln Xaa 
130 

(2) INFORMATION FOR SEQ ID NO: 254: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 254: 

Met Arg Tyr His Ala Gln Leu Ile Phe Cys Ile Phe Cys Xaa Phe Val 
1 5 10 15 

Phe Val Xaa Lys Xaa 

20 



(2) INFORMATION FOR SEQ ID NO : 255: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 255: 

Met Asn Asp Asn Ser Pro Asn His Ser Ser Ser Tyr Leu Pro Leu Pro 
1 5 10 15 

Leu Thr Ile Val Ile Leu Gln Thr Gly His Lys Gly Thr Leu Xaa 
20 25 30 



(2) INFORMATION FOR SEQ ID NO: 256: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 219 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY : linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 256: 

Met His Phe Leu Phe Arg Phe Ile Val Phe Phe Tyr Leu Trp Gly Leu 
1 5 10 15 

Phe Thr Ala Gln Arg Gln Lys Lys Glu Glu Ser Thr Glu Glu Val Lys 
20 25 30 

Ile Glu Val Leu His Arg Pro Glu Asn Cys Ser Lys Thr Ser Lys Lys 
35 40 45 

Gly Asp Leu Leu Asn Ala His Tyr Asp Gly Tyr Leu Ala Lys Asp Gly 
50 55 60 

Ser Lys Phe Tyr Cys Ser Arg Thr Gln Asn Glu Gly His Pro Lys Trp 
65 70 75 80 

Phe Val Leu Gly Val Gly Gln Val Ile Lys Gly Leu Asp Ile Ala Met 
85 90 95 

Thr Asp Met Cys Pro Gly Glu Lys Arg Lys Val Val Ile Pro Pro Ser 
100 105 110 

Phe Ala Tyr Gly Lys Glu Gly Tyr Ala Glu Gly Lys Ile Pro Pro Asp 
115 120 125 

Ala Thr Leu Ile Phe Glu Ile Glu Leu Tyr Ala Val Thr Lys Gly Pro 
130 135 140 

Arg Ser Ile Glu Thr Phe Lys Gln Ile Asp Met Asp Asn Asp Arg Gln 
145 150 155 160 

Leu Ser Lys Ala Glu Ile Asn Leu Tyr Leu Gln Arg Glu Phe Glu Lys 
165 170 175 

Asp Glu Lys Pro Arg Asp Lys Ser Tyr Gln Asp Ala Val Leu Glu Asp 
180 185 190 

Ile Phe Lys Lys Asn Asp His Asp Gly Asp Gly Phe Ile Ser Pro Lys 
195 200 205 

Glu Tyr Asn Val Tyr Gln His Asp Glu Leu Xaa 
210 215 



(2) INFORMATION FOR SEQ ID NO: 257: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 257: 

Met Trp Val Ile Arg Val Phe Gln Lys Thr Phe Leu Phe Phe Val Leu 
1 5 10 15 

Phe Trp Ser Val His Cys Ile Ser Asp Lys Phe Gly Cys Leu Trp His 
20 25 30 

Val Cys Met Lys Arg Glu Gly Asp Xaa Asn Cys Leu Ser Phe Ser Xaa 
35 40 45 

Leu Xaa 

50 



(2) INFORMATION FOR SEQ ID NO: 258: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 122 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 258: 

Met Pro Ser Gln Thr Glu Xaa Phe Ala Ala Cys Gly Gly His Ser Leu 
1 5 10 15 



Leu Leu Val Xaa Leu Pro Leu Gly Leu Pro Phe Cys Pro Arg Ala Ala 

20 25 30 

Leu Cys As? Leu Pro Phe Ser Leu Pro Ser Phe Pro Gly Gln Ala Arg 
35 40 45 

Arg Gly Gly Ala Glu Lys Gln Gly Ala Glu Gly Arg Gly Leu Gln Val 
50 55 60 

Lys Pro Arg Gly Gln Arg Thr Phe Gln Val Ser Arg Thr Ala Pro Ala 
65 70 75 80 

Ala Pro Arg Ser Arg Gln Pro Arg Pro Pro Ala Ala Leu Pro Ala Leu 
85 90 95 

Gly Phe Gly Gly Arg Gly Val Ala Lys Gly Arg Phe Leu Cys Phe Trp 
100 105 110 

Cys Leu Tyr Met Leu Arg Ile Asp Gln Xaa 
115 120 

(2) INFORMATION FOR SEQ ID NO: 259: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 88 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 259: 

Met Thr Ala Phe Cys Ser Leu Leu Leu Gln Ala Gln Ser Leu Leu Pro 
1 5 10 15 

Arg Thr Met Ala Ala Pro Gln Asp Ser Leu Arg Pro Gly Glu Glu Asp 
20 25 30 

Glu Gly Met Gln Leu Leu Gln Thr Lys Asp Ser Met Ala Lys Gly Ala 
35 40 45 

Arg Pro Gly Ala Xaa Arg Gly Arg Ala Arg Trp Gly Leu Ala Tyr Thr 
50 55 60 

Leu Leu His Asn Pro Thr Leu Gln Val Phe Arg Lys Thr Ala Leu Leu 
65 70 75 80 

Gly Ala Asn Gly Ala Gln Pro Xaa 
85 

(2) INFORMATION FOR SEQ ID NO: 260: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 260: 

Met Ile Gln Val Ser Val Pro Leu Leu Thr Ile Met Ile Phe Leu Leu 
1 5 10 15 

Tyr Leu Gln Ile Gly Pro Gly Lys Leu Xaa 
20 25 

(2) INFORMATION FOR SEQ ID NO: 261: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 261: 

Met Leu Leu Asp Pro Phe Ile Leu Leu Phe Cys Leu Phe Ser Thr Ala 
1 5 10 15 

Ala Gln Ser Cys Leu Glu Phe Ile Tyr Ile Gln Phe Xaa 
20 25 



(2) INFORMATION FOR SEQ ID NO: 262: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH : 44 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY : linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 262: 

Met Lys Phe Leu Ser Ile Leu Leu Asp Asp Asn Asn Phe Xaa Leu Met 
1 5 10 15 

Leu Met Leu Ala Pro Phe Gly Cys Leu Ala Phe Glu Arg Ser Met Lys 

20 25 30 

Met Arg Asn Gly Ala Leu Gly Leu Glu Glu Val Xaa 
35 40 



(2) INFORMATION FOR SEQ ID NO: 263: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 363 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 263: 

Met Arg Thr Leu Phe Asn Leu Leu Trp Leu Ala Leu Ala Cys Ser Pro 
1 5 10 15 

Val His Thr Thr Leu Ser Lys Ser Asp Ala Lys Lys Ala Ala Ser Lys 
20 25 30 

Thr Leu Leu Glu Lys Ser Gln Phe Ser Asp Lys Pro Val Gln Asp Arg 
35 40 45 

Gly Leu Val Val Thr Asp Leu Lys Ala Glu Ser Val Val Leu Glu His 
50 55 60 



Arg Ser Tyr Cys Ser Ala Lys Ala Arg Asp Arg His Phe Ala Gly Asp 
65 70 75 80 

Val Leu Gly Tyr Val Thr Pro Trp Asn Ser His Gly Tyr Asp Val Thr 
85 90 95 

Lys Val Phe Gly Ser Lys Phe Thr Gln Ile Ser Pro Val Trp Leu Gln 
100 105 110 

Leu Lys Arg Arg Gly Arg Glu Met Phe Glu Val Thr Gly Leu His Asp 
115 120 125 

Val Asp Gln Gly Trp Met Arg Ala Val Arg Lys His Ala Lys Gly Leu 
130 135 140 

His Ile Val Pro Arg Leu Leu Phe Glu Asp Trp Thr Tyr Asp Asp Phe 
145 150 155 160 

Arg Asn Val Leu Asp Ser Glu Asp Glu Ile Glu Glu Leu Ser Lys Thr 
165 170 175 

Val Val Gln Val Ala Lys Asn Gln His Phe Asp Gly Phe Val Val Glu 
180 185 190 

Val Trp Asn Gln Leu Leu Ser Gln Lys Arg Val Thr Asp Gln Leu Gly 
195 200 205 

Met Phe Thr His Lys Glu Phe Glu Gln Leu Ala Pro Val Leu Asp Gly 
210 215 220 

Phe Ser Leu Met Thr Tyr Asp Tyr Ser Thr Ala His Gln Pro Gly Pro 
225 230 235 240 

Asn Ala Pro Leu Ser Trp Val Arg Ala Cys Val Gln Val Leu Asp Pro 
245 250 255 

Lys Ser Lys Trp Arg Ser Lys Ile Leu Leu Gly Leu Asn Phe Tyr Gly 
260 265 270 

Met Asp Tyr Ala Thr Ser Lys Asp Ala Arg Glu Pro Val Val Gly Ala 
275 280 285 

Arg Tyr Ile Gln Thr Leu Lys Asp His Arg Pro Arg Met Val Trp Asp 
290 295 300 

Ser Gln Xaa Ser Glu His Phe Phe Glu Tyr Lys Lys Ser Arg Ser Gly 
305 310 315 320 

Arg His Val Val Phe Tyr Pro Thr Leu Lys Ser Leu Gln Val Arg Leu 
325 330 335 

Glu Leu Ala Arg Glu Leu Gly Val Gly Val Ser Ile Trp Glu Leu Gly 
340 345 350 

Gln Gly Leu Asp Tyr Phe Tyr Asp Leu Leu Xaa 
355 360 



(2) INFORMATION FOR SEQ ID NO: 264: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 128 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 264: 



Leu Pro Thr Lys Ile Leu Val Lys Pro Asp Arg Thr Phe Glu Ile Lys 
1 5 10 15 

Ile Gly Gln Pro Thr Val Ser Tyr Phe Leu Lys Ala Ala Ala Gly Ile 
20 25 30 

Glu Lys Gly Ala Arg Gln Thr Gly Lys Glu Val Ala Gly Leu Val Thr 
35 40 45 

Leu Lys His Val Tyr Glu Ile Ala Arg Ile Lys Ala Gln Asp Glu Ala 
50 55 60 

Phe Ala Leu Gln Asp Val Pro Leu Ser Ser Val Val Arg Ser Ile Ile 
65 70 75 80 

Gly Ser Ala Arg Ser Leu Gly Ile Arg Val Val Lys Asp Leu Ser Ser 
85 90 95 

Glu Glu Leu Ala Ala Phe Gln Lys Glu Arg Ala Ile Phe Leu Ala Ala 
100 105 110 

Gln Lys Glu Ala Asp Leu Ala Ala Gln Glu Glu Ala Ala Lys Lys Xaa 
115 120 125 



(2) INFORMATION FOR SEQ ID NO: 265: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 54 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 265: 

Met Leu Leu Gln Ile His Pro Leu Leu Pro Ser Pro Thr Ile Pro His 
1 5 10 15 

Ile Leu Leu Leu Phe Leu Tyr Pro Thr Phe Ser Ile Leu Glu His Ser 
20 25 30 

Cys Ser Tyr Cys Ile Glu Tyr Leu Trp Val Cys Leu Leu Phe Cys Leu 
35 40 45 



Ser Leu Trp Phe Leu Xaa 

50 



(2) INFORMATION FOR SEQ ID NO: 266: 
(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 29 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 266: 

Met Cys Leu Trp Cys Cys Gly Asp Val Cys Ser Gly Leu Ser Ser Leu 
1 5 10 15 

Leu Ser Leu Cys Val Cys Cys Val Val Leu Ala Val Cys 
20 25 



(2) INFORMATION FOR SEQ ID NO: 267: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 267: 

Glu Gly Leu Arg Leu Leu Leu Ser Leu Pro Ala Ala Leu Pro Arg Ser 
1 5 10 15 

Cys Cys His Pro Arg Trp Leu Pro Val Xaa 

20 25 



(2) INFORMATION FOR SEQ ID NO: 268: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 221 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 268: 

Met Phe His Gly Ile Pro Ala Thr Pro Gly Ile Gly Ala Pro Gly Asn 
1 5 10 15 

Lys Pro Glu Leu Tyr Glu Glu Val Lys Leu Tyr Lys Asn Ala Arg Glu 
20 25 30 

Arg Glu Lys Tyr Asp Asn Met Ala Glu Leu Phe Ala Val Val Lys Thr 
35 40 45 

Met Gln Ala Leu Glu Lys Ala Tyr Ile Lys Asp Cys Val Ser Pro Ser 
50 55 60 

Glu Tyr Thr Ala Ala Cys Ser Arg Leu Leu Val Gln Tyr Lys Ala Ala 
65 70 75 80 

Phe Arg Gln Val Gln Gly Ser Glu Ile Ser Ser Ile Asp Glu Phe Cys 
85 90 95 

Arg Lys Phe Arg Leu Asp Cys Pro Leu Ala Met Glu Arg Ile Lys Glu 
100 105 110 

Asp Arg Pro Ile Thr Ile Lys Asp Asp Lys Gly Asn Leu Asn Arg Cys 
115 120 125 

Ile Ala Asp Val Val Ser Leu Phe Ile Thr Val Met Asp Lys Leu Arg 
130 135 140 

Leu Glu Ile Arg Ala Met Asp Glu Ile Gln Pro Asp Leu Arg Glu Leu 
145 150 155 160 

Met Glu Thr Met His Arg Met Ser His Leu Pro Pro Asp Phe Glu Gly 
165 170 175 

Arg Gln Thr Val Ser Gln Trp Leu Gln Thr Leu Ser Gly Met Ser Ala 
180 185 190 

Ser Asp Glu Leu Asp Asp Ser Gln Val Arg Gln Met Leu Phe Asp Leu 
195 200 205 

Glu Ser Ala Tyr Asn Ala Phe Asn Arg Phe Leu His Ala 
210 215 220 



(2) INFORMATION FOR SEQ ID NO: 269: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 269: 

Met Lys Xaa 
1 



(2) INFORMATION FOR SEQ ID NO: 270: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 49 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 270: 

Met Gln Ala Pro Phe Xaa His Phe Ser Phe Arg Met Phe Ser Asn Leu 
1 5 10 15 

Tyr Cys Phe Ser Asp Phe Gln Pro Asn Ile Ser Pro Cys Pro Leu Cys 
20 25 30 

His Cys Ile Leu Pro Xaa His His His Val Phe Leu Leu Leu Ala Val 
35 40 45 

Xaa 



(2) INFORMATION FOR SEQ ID NO: 271: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 52 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 271: 

Met Lys Leu Val Thr Met Phe Asp Lys Leu Ser Arg Asn Arg Val Ile 
1 5 10 15 

Gln Pro Met Gly Met Ser Pro Arg Gly His Leu Thr Ser Leu Gln Asp 
20 25 30 

Ala Met Cys Glu Thr Met Glu Gln Gln Leu Ser Ser Asp Pro Asp Ser 
35 40 45 

Asp Pro Asp Xaa 

50 



(2) INFORMATION FOR SEQ ID NO: 272: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 272: 

Met Ala Val Gly Glu Ala Val Phe Val Pro Leu Gin His Pro Pro Leu 

1 5 10 15 

Leu His Gly Ser Pro Ile Pro Lys Leu Leu Pro Gly Pro Leu Leu Xaa 
20 25 30 



(2) INFORMATION FOR SEQ ID NO: 273: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 57 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY : linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 273: 

Met Asn Gly Cys His Arg Arg Lys Arg Leu His Leu Cys Lys Thr Ile 
1 5 10 15 

Tyr Leu Leu Trp Phe Val Phe Ser Phe Leu Leu Ser Asn Glu Val Val 
20 25 30 

Ser Ser His Trp His Ile Leu Arg Ala Val Gln Ile Ile Cys Thr Leu 
35 40 45 

Phe His Arg Xaa Ile Ser Ala Phe Xaa 

50 55 



(2) INFORMATION FOR SEQ ID NO: 274: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 274: 

Met Gly Trp Val Ser Ser Pro His Val Lys Arg Arg Glu Cys Val Leu 
1 5 10 15 

Lys Lys Pro Phe Phe Xaa 
20 


(2) INFORMATION FOR SEQ ID NO: 275: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 275: 

Met Phe Asn Phe Phe Lys Asn Pro Leu Leu Thr Cys Leu Phe Ile Ser 
1 5 10 15 

Cys Tyr Leu Tyr Leu Ser Leu Leu Val Asn Lys Val Leu Phe Ala Glu 
20 25 30 

Glu Gly Leu Cys Cys Thr Tyr Cys Thr Thr Ser Asn Thr Gly Glu Gly 
35 40 45 

Gly Val Xaa 
50 




(2) INFORMATION FOR SEQ ID NO: 276: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 276: 

Met Xaa 
1 


(2) INFORMATION FOR SEQ ID NO: 277: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 66 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 277: 

Met Leu Cys Thr Ile Leu Thr Val Val Ile Ile Ile Ala Ala Gln Thr 
1 5 10 15 



